APPLICATIONS OF TWO OSCULATORY FORMULAS
By John L. Roberts 

INTRODUCTION 

The main purpose of this paper is to illustrate how Mr. Jenkins’ osculatory
formulas¹ (A) and (B) can be applied in a convenient manner. The first section
of this paper will be little more than a summary of some of the formulas
contained in the other three articles. The second section will contain the
applications.


I. SOME MATHEMATICS OF THE FORMULAS 

The Woolhouse notation will in this paper be used to stand for the differences
of $u_{x+n}$, which represents the given values of a function. The general formulas are

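$$y_x = y_0 + x\,\Delta y_0 + \tfrac{1}{2}x(x-1)B + \tfrac{1}{6}x(x-1)(x-\tfrac{1}{2})C; \qquad (1)$$

and

$$y_x = u_0 + x\,a_1 + \tfrac{1}{2}x(x-1)B + \tfrac{1}{6}x(x-1)(x-\tfrac{1}{2})C. \qquad (2)$$

The special formulas belonging to (2) are

$$B = b - \tfrac{1}{6}d \quad\text{and}\quad C = c_1 - \tfrac{1}{6}e_1, \qquad (A)$$

where b and d are defined by $b = \tfrac{1}{2}(b_0 + b_1)$ and by $d = \tfrac{1}{2}(d_0 + d_1)$; and

$$B = b \quad\text{and}\quad C = 0. \qquad (B)$$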


The special formulas belonging to (1) are 

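$$y_0 = u_0 + \tfrac{1}{4}b_0, \quad B = b, \quad\text{and}\quad C = 0; \qquad (C)$$

and

$$y_0 = u_0 - \tfrac{1}{36}d_0, \quad B = b - \tfrac{1}{6}d, \quad\text{and}\quad C = c_1 - \tfrac{1}{6}e_1. \qquad (D)$$

Formula (C) is equivalent to Mr. Jenkins’ formula (A). Also (D) is equivalent
to his formula (B).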


1 This paper presupposes a knowledge of three other articles. The first one by Mr.
Wilmer A. Jenkins is entitled “Graduation Based on a Modification of Osculatory
Interpolation,” and is printed in the October 1927 issue of the Transactions of the
Actuarial Society of America. The other two papers are mine. One of them is entitled
“Some Practical Interpolation Formulas,” and is printed in the September 1935 issue of
these Annals. The other one, entitled “A Family of Osculatory Formulas,” is printed in
the October 1935 issue of the Transactions.

II. APPLICATIONS OF (C) AND (D)

First, there is the problem of selecting suitable examples to which (C) and (D) 
can be applied. Secondly, we will then apply in a convenient manner the 
formulas to these examples. 

The problem of selecting suitable examples will now be considered. “The 
non-reproducing characteristic of” formula (D) “raises the question of what 
will happen in the graduation of a series whose fourth differences are all
positive, say. The answer is that the graduated series will lie everywhere below
the observed points and that the observations will not be correctly represented 
by the interpolated series.” On the other hand, if we select a series whose 
fourth differences change frequently in sign, (D) because of its non-reproducing 
characteristic has valuable smoothing possibilities. In like manner, (C) may 
be valuable when the second differences change frequently in sign. Mr. Jenkins
gives at quinquennial ages rates of mortality which were graphically determined 
from the published American Men Ultimate Experience. Since the fourth 
differences of these rates change frequently in sign, we will apply (D) to a few 
of these rates. So far as I know no suitable actuarial examples have been 
found to which (C) can be applied. However, there is the possibility that (C) 
might be valuable in some sciences. Since I do not know of any suitable real 
example to which (C) can be applied, we will apply it to a trivial series whose 
second differences change frequently in sign. 

We are now ready to apply in a convenient manner (C) and (D) to the 
examples selected in the preceding paragraph. 

First, we will apply (C). I have in my other article applied (B) in a
convenient manner. This method with little change can be applied to (C). If
it is desired to apply (C) at either end of the table where values of $u_x$ are not
available for the calculation of the second differences, it can be assumed they
vanish. It is convenient if $\Delta$ and $\Delta^2$ represent respectively the major
differences $\Delta u_x$ and $\Delta^2 u_x$ in such a manner that they are arranged
centrally in the working illustration. It is convenient if $\delta$ and $\delta^2$
represent respectively the minor differences $\delta y_x$ and $\delta^2 y_x$. The
quantity $y_0$ can be computed by $y_0 = u_0 + \tfrac{1}{4}b_0$, and $y_1$ can be
computed in like manner. Since we wish in the working illustration of (C) to
interpolate four values between $y_0$ and $y_1$, the middle
$\delta = \delta y_{.4} = .2\,\Delta y_0$, and $\delta^2 = .04B = .02(b_0 + b_1)$. We can
by the use of the foregoing method apply (C) to suitable functions, whose given
values can be represented by $f(r)$. Then, it follows from the definition of $u_x$
that $f(r) = u_x$. It might prevent confusion if it is stated that x and r are
related to each other in such a way that we always interpolate between $y_0$ and
$y_1$. We shall now apply (C) to the case when $f(r)$ represents the trivial series
shown at top of page 3.

Finally, we will apply (D). Mr. Henderson has applied (A) in a very
convenient manner. His method with little change can be applied to (D). If it
is desired to apply (D) at either end of the table where values of $u_x$ are not
available for the calculation of the differences required, it can be assumed
that the fourth differences that can not be computed vanish, and the required
differences can be filled in consistently with that assumption. It is convenient
if $\Delta$, $\Delta^2$, and $\Delta^3$ represent respectively the major differences
$\Delta u_x$, $\Delta^2 u_x$, and $\Delta^3 u_x$ in such a manner that they are arranged
centrally in the working illustration. It is convenient if $\delta$, $\delta^2$, and
$\delta^3$ represent the minor differences so that by definition
$\delta = \delta y_x$, $\delta^2 = \delta^2 y_{x-.2}$, and $\delta^3 = \delta^3 y_x$. The first
$\delta^2 = \delta^2 y_{-.2} = .04(b_0 - \tfrac{1}{6}d_0)$. The last
$\delta^2 = \delta^2 y_{.8} = .04(b_1 - \tfrac{1}{6}d_1)$. The quantity $y_0$ can be
computed by $y_0 = u_0 - \tfrac{1}{36}d_0$, and $y_1$ can be computed in like
manner. The middle $\delta = \delta y_{.4} = .2\,\Delta y_0 - \delta^3$. We are now in
position to apply (D) to the quinquennial rates of mortality.


Age    Rate      Δ         Δ²        Δ³         Δ⁴

72    .07010   .03808
77    .10818   .04669   .00861    .01799
82    .15487   .07329   .02660   -.01946   -.03745
87    .22816   .08043   .00714    .12572    .14518
92    .30859   .21329   .13286    .12572    .00000
97    .52188            .25858              .00000


Age    y_x        δ          δ²          δ³

82    .155910   .012612    .001314
83    .168522   .013527    .000915
84    .182049   .014043    .000516   -.000399
85    .196092   .014160    .000117
86    .210252   .013878   -.000282
87    .224130   .013460   -.000682
88    .237590   .013977    .000517
89    .251567   .015693    .001716    .001199
90    .267260   .018608    .002915
91    .285868   .022722    .004114
92    .308590   .028006    .005314
93    .336596   .034326    .006320
94    .370922   .041652    .007326    .001006
95    .412574   .049984    .008332
96    .462558   .059322    .009338
97    .521880              .010343
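The working table for ages 82 to 87 can be checked directly against formula (1) with the special values (D). The following is only a minimal sketch: the function name `formula_d` is hypothetical, and the fractions 1/36 and 1/6 are the ones recovered from the worked figures above (they reproduce the tabulated values to within rounding), not a transcription of the printed formula.

```python
def formula_d(x, u0, u1, b0, b1, d0, d1):
    """Evaluate formula (1) with the special values (D) for 0 <= x <= 1.

    u0, u1: given values at the two ends of the interval
    b0, b1: central second differences at the two ends
    d0, d1: central fourth differences at the two ends
    """
    y0 = u0 - d0 / 36.0                      # y_0 = u_0 - d_0/36
    y1 = u1 - d1 / 36.0                      # y_1 computed in like manner
    B = 0.5 * (b0 + b1) - (d0 + d1) / 12.0   # B = b - d/6, b and d being means
    C = (b1 - b0) - (d1 - d0) / 6.0          # C = c_1 - e_1/6
    return (y0 + x * (y1 - y0)
            + 0.5 * x * (x - 1.0) * B
            + x * (x - 1.0) * (x - 0.5) * C / 6.0)


# Quinquennial rates and differences at ages 82 and 87, from the first table.
u0, u1 = 0.15487, 0.22816
b0, b1 = 0.02660, 0.00714
d0, d1 = -0.03745, 0.14518

graduated = [formula_d(k / 5.0, u0, u1, b0, b1, d0, d1) for k in range(6)]
for age, y in zip(range(82, 88), graduated):
    print(age, round(y, 6))
```

Evaluating the polynomial directly avoids the difference scheme altogether; the difference scheme in the text is the hand-computation equivalent and should agree apart from rounding in the last place.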





SOME SIMPLE DEVELOPMENTS IN THE USE OF THE 
COEFFICIENT OF STABILITY 


By C. H. Forsyth 

Some time ago the writer proposed¹ a coefficient of stability $C_s$ to be used
to measure the stability of a statistical series, where that coefficient is defined
by the relation

$$C_s = \frac{\sigma^2}{M} \qquad (1)$$

where M denotes the arithmetic mean and $\sigma^2$ the square of the dispersion of
the terms of the series. It was proposed to regard series as unstable (Lexian)
for which the value of the coefficient exceeded unity, and stable otherwise.
The only essential way in which such a procedure differs in results from the
traditional method is that it includes as stable those series for which the value
of the coefficient lies between unity and q, the probability of failure of the event
under investigation; such series would be classed as unstable according to
the traditional method. Stable series, according to either standard, are found
so rarely in practice, and so many series which come anywhere near meeting the
requirements are therefore accepted as fairly stable, that replacing q by unity
as the line of demarcation affects the classification of no known series. It does,
however, add to the effectiveness of the avowed purpose of the proposed
coefficient, which is to avoid the round-about work of computing values of
probabilities. Another merit of the use of the coefficient is that it enables one
to measure and therefore compare the stability of several series, a feature which
we shall illustrate later.

In brief, such a coefficient provides a means of introducing the whole Lexian
theory into Federal publications such as those on vital statistics, since a
comparison of the values of the coefficient for, say, different communities or
countries, would be readily grasped by any reader, whereas the traditional method
would prove too subtle and laborious, and allow no ready comparison of results.

For purpose of orientation let us illustrate the situation by analyzing a simple
series both ways: the traditional way and by the use of the coefficient of
stability. As an example, let us consider the death rates of white infants under
one year of age for 1919 (considered on page 89 of the Handbook) for those
states whose frequencies of births are comparable or which vary little from
their average of 47,830, where the number of deaths for each state has been
adjusted to this average as a base.

1 Journal of the American Statistical Association, June, 1932.




            Adjusted
            Deaths X     X - 3659     (X - 3659)²

Cal.          3350         -309           95481
Conn.         4700         1041         1083681
Ind.          3732           73            5329
Kan.          3253         -406          164836
Ky.           3686           27             729
Minn.         3159         -500          250000
N. Car.       3541         -118           13924
Va.           3732           73            5329
Wis.          3780          121           14641

           9)32933      1335 - 1333    9)1633950

         M = 3659                    σ² = 181550
                                      σ = 426

The traditional method would be: The mean $M = np = 3659$ where $n = 47{,}830$.
Hence $p = \dfrac{3659}{47830}$ and $q = \dfrac{44171}{47830}$, and
$\sigma_B^2 = npq = 3659\,q = 3378$, whence $\sigma_B = 58.15$,
which is the value of the dispersion we should expect if the basic probability
were constant throughout. But the value of the dispersion proves to be
$\sigma = \sqrt{181550} = 426$, and the comparison of the values shows the basic
probability to be very variable and therefore the series to be very unstable or
Lexian.

The computation of the value of the coefficient of stability is much more
simple and direct:

$$C_s = \frac{\sigma^2}{M} = \frac{181550}{3659} = 49.6$$

whose excess over unity also clearly indicates the instability of the series.
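The two analyses can be placed side by side in a few lines. This is a minimal sketch using the adjusted deaths tabulated above; the variable names are illustrative only, and the mean is left unrounded, so the last digits differ slightly from the hand computation (which takes deviations from 3659).

```python
import math

# Adjusted infant deaths for the nine states, and the common base of births.
deaths = [3350, 4700, 3732, 3253, 3686, 3159, 3541, 3732, 3780]
n_births = 47830

k = len(deaths)
M = sum(deaths) / k                                  # arithmetic mean
sigma2 = sum((x - M) ** 2 for x in deaths) / k       # square of the dispersion

# Traditional method: dispersion expected under a constant probability p.
p = M / n_births
q = 1.0 - p
sigma_B = math.sqrt(n_births * p * q)

# Coefficient of stability, relation (1).
C_s = sigma2 / M

print(round(M, 1), round(math.sqrt(sigma2), 1), round(sigma_B, 1), round(C_s, 1))
```

Both routes lead to the same verdict here: the observed dispersion is several times the expected one, and the coefficient is far above unity.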

Since proposing the coefficient of stability the writer has been impressed by
the overwhelming proportion of existing series (such as birth rates, various kinds
of death rates, etc.) which employ arbitrary bases (such as “per thousand,”
“per ten thousand,” etc.) usually without mention of the actual base. It is
obvious, of course, that such rates, or occurrences per arbitrary base, say b,
can first be adjusted to give occurrences per actual base, say B (assuming that
base B* can be determined) but the work can evidently be performed much
more easily. For, since the original series (per arbitrary base b)
$X_1, X_2, \ldots, X_k$ would become, on adjustment,
$\frac{B}{b}X_1, \frac{B}{b}X_2, \ldots, \frac{B}{b}X_k$, the mean would become
$\frac{B}{b}M$ and the square of the dispersion $\frac{B^2}{b^2}\sigma^2$, whence the
formula for the coefficient of stability would become

$$C_s = \frac{\sigma^2}{M}\cdot\frac{B}{b} \qquad (2)$$


As an example, let us consider the general death rates, per 10,000, of New
Zealand for the years 1921-30.

            X       X - 86     (X - 86)²

1921       87          1           1
1922       88          2           4
1923       90          4          16
1924       83         -3           9
1925       83         -3           9
1926       87          1           1
1927       85         -1           1
1928       85         -1           1
1929       88          2           4
1930       86          0           0

       10)862       10 - 8      10)46

      M = 86.2                σ² = 4.6


This example illustrates the danger of using the coefficient of stability unless
the series consists of actual occurrences or unless the actual base is given due
consideration. Without due consideration of the actual base (here the population
of New Zealand) one might easily fall into the error of regarding the value
of the coefficient of stability as 4.6/86.2 and, therefore, the series as very
stable. But the population of New Zealand is about a million and a half and,
therefore, the true value of the coefficient of stability is

$$C_s = \frac{4.6}{86.2}\cdot\frac{1{,}500{,}000}{10{,}000} = 8.0$$

which shows the series to be unstable. However, before we condemn New
Zealand’s death rates too severely, let us compare her record with those of
other important countries, including our own, for the same period.

* Strictly speaking, this actual base B should be constant throughout the series;
otherwise the successive numbers of occurrences (the terms of the series) would not be
comparable. Where, however, the base B varies little from term to term, as usually happens
even in the best of series, such as a series of some kind of rates of the same community
over a short interval, the variation can be ignored, in which case base B (to which the
terms of the series are adjusted) usually means the arithmetic mean of the different bases.
In the first example treated above, the investigation was limited to certain states in an
effort to comply with the rule just mentioned but the example is a poor one since the
variations are still dangerously too large. The situation is saved by the conclusive results.
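Formula (2) is a one-line correction, and the New Zealand figures show how much it matters. The sketch below uses the summary values from the table above ($\sigma^2 = 4.6$, $M = 86.2$) and the round population figure quoted in the text; the function name is hypothetical.

```python
def coefficient_of_stability(sigma2, M, actual_base=1.0, arbitrary_base=1.0):
    """Formula (2): C_s = (sigma^2 / M) * (B / b).

    With both bases left at 1 this reduces to relation (1), which is only
    appropriate when the series consists of actual occurrences.
    """
    return (sigma2 / M) * (actual_base / arbitrary_base)


# Rates per 10,000 treated (wrongly) as if they were actual occurrences.
naive = coefficient_of_stability(4.6, 86.2)

# The same rates referred to a population of about a million and a half.
adjusted = coefficient_of_stability(4.6, 86.2,
                                    actual_base=1_500_000,
                                    arbitrary_base=10_000)

print(round(naive, 3), round(adjusted, 1))
```

The naive value is well below unity and the adjusted value well above it, so the classification of the series reverses once the actual base is taken into account.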


General Death Rates (per 10,000)

                       M        C_s

New Zealand           86.2        8
Australia             94.3       90
Sweden               120.4       96
Scotland             137.3      139
Austria              151.1      536
United States        118.0      830
England-Wales        121.3     1117
France               170.3     1129
Spain                193.7     2190
Italy                163.5     2760
Germany              125.4     6040
Japan                206.4     6800


These results show how extremely unstable most series of general death
rates are and that the series for New Zealand, while unstable according to
our strict criterion, enjoys quite an enviable position practically in a class by
itself. Parenthetically, these results also illustrate fairly well the triviality,
with respect to results, of replacing q by unity as the critical value of the
coefficient of stability, discussed at the beginning of this article.

The values of the coefficient listed above would, of course, be reduced somewhat
in most cases if the trend of the series were first eliminated, but the writer
has gone through all this work and found it not worth while; that is, the series
would still remain markedly unstable.

Another development proves useful when, as frequently happens, the actual
base B is unknown to a degree of accuracy desirable for use in formula (2).
From the inequality

$$\frac{\sigma^2}{M}\cdot\frac{B}{b} \leq 1$$

we obtain

$$B \leq \frac{M}{\sigma^2}\,b \qquad (3)$$

which is to be used to show how small an actual base should be for the given
series to be stable. As an example, let us consider the maternal mortality,
per 10,000 live births, in the so-called expanding registration area of the United
States.

Maternal Deaths in the United States (per 10,000 live births)
(Expanding Registration Area)

            X       X - 66     (X - 66)²

1923       67          1           1
1924       66          0           0
1925       65         -1           1
1926       66          0           0
1927       65         -1           1
1928       69          3           9
1929       70          4          16
1930       67          1           1
1931       66          0           0
1932       64         -2           4

       10)665        9 - 4      10)33

      M = 66.5                σ² = 3.3

Hence, by formula (3), $B \leq \dfrac{66.5}{3.3}(10{,}000)$ or about 200,000. The number
of live births varies so greatly that we should probably find it impossible to
agree upon a satisfactory number² to use as an actual base for such an
“expanding area” but we should all agree that it would be so much greater than
200,000 that the instability of the series would be unquestioned.
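The bound of formula (3) can be sketched as follows, using the summary values $M = 66.5$ and $\sigma^2 = 3.3$ from the table above. The function name is hypothetical, and the figure of two million births is the rough 1932 value mentioned in the footnote, used here only for comparison.

```python
def largest_stable_base(sigma2, M, arbitrary_base):
    """Formula (3): the series is stable only if B <= (M / sigma^2) * b."""
    return (M / sigma2) * arbitrary_base


bound = largest_stable_base(sigma2=3.3, M=66.5, arbitrary_base=10_000)
print(round(bound))

# Rough comparison with an actual base of about two million live births.
approximate_births = 2_000_000
print(approximate_births / bound)   # how many times the bound is exceeded
```

Because the bound depends only on the published rates, it can be computed even when the actual base is known no better than to an order of magnitude.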

One must be careful in comparing the results of two or more investigations
like the one just conducted. For example, the analogous result for Canada
for the same period yields $B \leq 113{,}000$ and we might conclude, too hastily,
that the United States series is more stable (or less unstable) whereas any
knowledge whatever of the numbers of live births of the two countries would
show that Canada comes much closer to fulfilling her requirement than the
United States and that the palm must go to Canada. For one thing, Canada
has about the population of New York city and New York city has about
100,000 live births annually. In any case, close decisions in matters of this
kind would be difficult without sufficient information in regard to actual bases.

There is still another situation which is interesting but of much less
importance because of the rarity of its occurrence. It will be recalled that the
coefficient of stability was devised mainly to avoid the use and computation of
probabilities and that the only difference between the results by the traditional
method and by the use of the coefficient of stability lies in the trivial
replacement of the critical value q by unity. In the traditional method of analysis,
but by comparing the value of the coefficient of stability with q, the coefficient
is evidently always, strictly speaking, a function of the actual base B. In
other words, there is no statistical series, however stable it may seem (except
for the trivial case when all the terms of the series are exactly the same) but
what would be unstable if the base were small enough. It is possible to
formulate the limit, once for all, below which the given (otherwise seemingly
stable) series would prove unstable.

2 It was in the neighborhood of two million in 1932.

If, in the relation $\sigma^2 \leq npq$ (for stability) we replace p by M/n, q by
1 − M/n and then n by B, we obtain

$$\sigma^2 \leq M\,\frac{B - M}{B} \quad\text{or}\quad \frac{M^2}{B} \leq M - \sigma^2$$

whence, finally

$$B \geq \frac{M^2}{M - \sigma^2} \qquad (4)$$

where the transference of the term $M - \sigma^2$ from one side to the other should
cause no apprehension since, by hypothesis, $\sigma^2 < M$ and $M - \sigma^2$ is therefore
always positive. We propose to employ formula (4) in those rare cases where
the value of the coefficient of stability of actual occurrences, but without
reference to an actual base, is less than unity (that is, where the given series
proves to be stable according to the method proposed by the writer) and
determine the upper limit of the values of the base B for which the series would
be unstable according to the traditional method of analysis. As an
illustration, let us consider the familiar series of annual football fatalities in this
country for the period 1906-1930* (omitting the years when no records were kept).


Football Fatalities

1906    11        1917    12
1907    11        1921    12
1908    13        1923    18
1909    12        1925    20
1911    11        1926     9
1912    13        1927    17
1913     5        1928    18
1914    13        1929    12
1915    15        1930    13


It is easily verified that $C_s = \dfrac{11.942}{13.055}$, which is clearly less than
unity; whence the series clearly seems stable. Applying formula (4)

$$B \geq \frac{13.055^2}{13.055 - 11.942} \quad\text{or}\quad 153$$

which shows that the given series is stable as long as the total number of
football players exceeds the number 153. A recent news item quoted an estimate
of the number of players participating in games of four hundred colleges as about
13,000 and over 600,000 including high schools and all. We can then definitely
say that the series just considered is stable. Such a conclusion has no bearing,
of course, upon what might happen if other terms were added to the series.
It happens that adding the records for the next five years, 1931(33), 1932(32),
1933(27), 1934(25), 1935(30), would change the whole series to an unstable
one with $C_s = 56.9/16.6 = 3.4$; but, obviously, the additional records belong
to a new regime of collection.
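The football figures can be run through relation (1) and formula (4) as a check. This is a minimal sketch with the eighteen tabulated values; variable names are illustrative, and the dispersion is taken about the unrounded mean.

```python
# Annual football fatalities, 1906-1930, for the years with records.
fatalities = [11, 11, 13, 12, 11, 13, 5, 13, 15,
              12, 12, 18, 20, 9, 17, 18, 12, 13]

k = len(fatalities)
M = sum(fatalities) / k
sigma2 = sum((x - M) ** 2 for x in fatalities) / k

C_s = sigma2 / M                 # relation (1), actual occurrences
B_min = M ** 2 / (M - sigma2)    # formula (4), valid because sigma2 < M

print(round(M, 3), round(sigma2, 3), round(C_s, 3), round(B_min))
```

Any plausible count of players exceeds the resulting limit by several orders of magnitude, which is why the conclusion of stability is insensitive to the estimate used.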



INTERNAL AND EXTERNAL MEANS ARISING FROM THE SCALING 
OF FREQUENCY FUNCTIONS 

By Edward L. Dodd 

The scaling 1 of frequency functions has been discussed from the standpoint 
of maximum likelihood. But the likelihood criterion to be satisfied sometimes 
leads to a minimum likelihood; and sometimes to neither a maximum nor a 
minimum. Scaling will be studied in this paper with reference to the likelihood 
actually secured, and also with reference to the character of means obtained, 
whether internal or external. 


SECTION 1. INTRODUCTION 


It is well known that a scale obtained in a curve-fitting process is sometimes
a mean. Thus, with the normal function

(1) $\dfrac{1}{a\sqrt{2\pi}}\; e^{-(x/a)^2/2}$

if the scale a is to be obtained from measurements, $x_1, x_2, \ldots, x_n$, we
commonly accept the value

(2) $a = \left\{\dfrac{1}{n}\sum x_i^2\right\}^{1/2}$;

that is, the root-mean square of the measurements. Here, the positive value
of a is naturally taken. It is called the standard deviation, and thought of as
an appropriate new unit of measure.

But even with the x’s all negative, and the a taken positive, O. Chisini² 
considered it proper to regard a as a mean of the x’s, albeit an external mean.
From Chisini’s viewpoint, this a whether regarded as positive or negative is
primarily a solution of

(3) $x_1^2 + x_2^2 + \cdots + x_n^2 = a^2 + a^2 + \cdots + a^2$.

In this sum of squares, the single number a may be substituted for each of the
x’s. Perhaps this kind of mean should be called a substitutive mean to
distinguish it from the means of general analysis which are always internal.


1 Fisher, R. A., “On the mathematical foundation of theoretical statistics,”
Philosophical Transactions of the Royal Society of London, Series A, Vol. 222, 309-368, (1921).
See p. 338.

2 Chisini, O., “Sul concetto di media,” Periodico di Matematiche, Series 4, Vol. 9, 106-116,
(1929).

The normal function is a particular case of a more general function:

(4) $\text{Constant} \cdot a^{-1} e^{\phi(t)}, \quad \phi(t) = -t^p/p, \quad t = x/a.$

The likelihood method to find the scale a for this function leads to power means,
including the arithmetic mean, the root-mean-square, root-mean-cube, etc., for
p = 1, 2, 3, etc.
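The connection with power means can be made concrete. With $\phi(t) = -t^p/p$ one has $t\phi'(t) = -t^p$, so setting $\sum t_i\phi'(t_i) + n = 0$ gives $a^p = \frac{1}{n}\sum x_i^p$. The sketch below assumes, beyond what the text states, that $|t|$ is used in place of $t$ so that negative measurements are admissible; both function names are hypothetical.

```python
def power_mean_scale(xs, p):
    """Scale a = ((1/n) * sum |x_i|^p)^(1/p) for phi(t) = -|t|^p / p."""
    n = len(xs)
    return (sum(abs(x) ** p for x in xs) / n) ** (1.0 / p)


def likelihood_residual(xs, a, p):
    """Left member of sum t*phi'(t) + n = 0, where t*phi'(t) = -|t|^p."""
    return sum(-abs(x / a) ** p for x in xs) + len(xs)


measurements = [1.0, 2.0, 3.0, 4.0]
for p in (1, 2, 3):
    a = power_mean_scale(measurements, p)
    print(p, round(a, 4), abs(likelihood_residual(measurements, a, p)) < 1e-12)
```

For p = 1 the scale is the arithmetic mean of the absolute values, for p = 2 the root-mean-square, and so on, which is the family of power means referred to above.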

The word scale will be used only for a positive number,—which then may be 
regarded as a unit of measurement. 

For measurements, $x_1, x_2, \ldots, x_n$, Chisini regarded M as a mean, relative
to a function G, provided

(5) $G(x_1, x_2, \ldots, x_n) = G(M, M, \ldots, M)$.

If a solution of this equation is

(6) $M = F(x_1, x_2, \ldots, x_n)$,

and c is a possible value for the x's, it follows at once that

(7) $F(c, c, \ldots, c) = c$,

or at least one value of this F is c. Conversely, if (7) is satisfied, it is but a
change of notation to replace c in (7) by M, and to combine this with (6) to
obtain

(8) $F(x_1, x_2, \ldots, x_n) = F(M, M, \ldots, M)$.

Hence, this F which in (6) gives explicit form to the implicit M found in (5)
may also be thought of as a mean-forming function, such as G in (5). Briefly,
F is a particular G. Thus $F(x_1, x_2, \ldots, x_n)$ is a mean of $x_1, x_2, \ldots, x_n$, if F
is so constructed that (7) is satisfied when the arguments are all equal.

Inasmuch as a frequency function f(t) is non-negative, $\log_e f(t)$ is real, say
$\phi(t)$ plus constant. Following R. A. Fisher, it will be convenient to write

(9) $f(t) = C a^{-1} e^{\phi(t)}, \quad C = \text{Constant}.$

With location m already determined, the x’s will be thought of as measured
from m. And we set

(10) $t = x/a, \quad t_i = x_i/a, \quad i = 1, 2, \ldots, n.$

The “productive” probability, to yield $x_1, x_2, \ldots, x_n$, is then

(11) $L = \prod f(t_i) = C^n a^{-n} e^{\sum \phi(t_i)}.$

This is proportional³ to the “likelihood” of a. Also, it may be noted in
passing, the productive probability is also proportional to the a posteriori
probability, if a constant a priori probability is postulated. The likelihood
will here be taken as $\prod f(t_i)$ itself; and it will be designated by L; in Fisher's
notation, $L = \log \prod$. Of course, $\prod$ and $\log \prod$ take maximum values
simultaneously, if at all. From (11) it follows⁴ that

(12) $-a\,\partial \log L/\partial a = n + \sum t_i\phi'(t_i) = \sum\{t_i\phi'(t_i) + 1\}.$

The equation

(13) $\sum t_i\phi'(t_i) + n = 0 \quad (i = 1, 2, \ldots, n)$

will be called the likelihood condition, whether this leads to maximum
likelihood, to minimum likelihood, or to neither. A second differentiation⁵ leads to

(14) $a^2\,\partial^2 \log L/\partial a^2 = \sum t_i^2\phi''(t_i) - n = \sum\{t_i^2\phi''(t_i) - 1\}.$

When negative, this indicates a maximum likelihood; when positive, a minimum
likelihood for the a obtained from (13).

3 Loc. Cit., Fisher, p. 310.

Preparatory to the theorems of the next section, just one more matter will
be discussed. The unit for t is arbitrary; and it may be convenient to write,
with $k \neq 0$,

(15) $\phi(t) = \phi(ku) = \Phi(u), \quad t = ku.$

Then

(16) $u\,\Phi'(u) = ku\,\phi'(ku) = t\,\phi'(t).$

Suppose, now, that a positive constant k can be found such that $k\phi'(k) = -1$.
Then, with t = ku, as postulated,

(17) $1\cdot\Phi'(1) = k\phi'(k) = -1.$

Thus $\Phi'(1) = -1$, or as it will now be written $\phi'(1) = -1$, is no more
restrictive than the condition that some positive k exists such that $k\phi'(k) = -1$.

SECTION 2. GENERAL THEOREMS CONCERNING THE SCALE AS A MEAN 

Theorem I 

Given the frequency function

(18) $f(t) = C a^{-1} e^{\phi(t)}, \quad t = x/a, \quad t_i = x_i/a, \quad C = \text{Constant}.$

And suppose that

(19) $\phi'(1) = -1.$

Suppose, also, that for given $x_1, x_2, \ldots, x_n$, the likelihood condition (13),
now written

(20) $\sum (x_i/a)\,\phi'(x_i/a) + n = 0,$

has a positive solution.

(21) $a = F(x_1, x_2, \ldots, x_n).$

Then this a, the scale, is a mean.

Proof. With each $x_i = 0$, (20) cannot be satisfied.

But if, with $c \neq 0$, we take each $x_i = c$, and at the same time set a = c,
then, by (19), $\sum = -n$; and thus (20) which gives a implicitly is satisfied.
The explicit a in (21) is therefore such a function F that (7) is satisfied. Hence,
the scale a is a mean.

4 Loc. Cit., Fisher, p. 338.

5 Loc. Cit., Fisher, p. 339.

Theorem II 

Given the frequency function

(18) $f(t) = C a^{-1} e^{\phi(t)}, \quad t = x/a, \quad t_i = x_i/a, \quad C = \text{Constant}.$

Suppose that

(19) $\phi'(1) = -1,$

and that

(22) $|t\phi'(t)| < 1$ if $|t| < 1$.

Moreover, suppose that the likelihood condition (20) for measurements
$x_1, x_2, \ldots, x_n$, has a positive solution a. Then

(23) $a \leq \text{Maximum}\,|x_i|$.

Or, suppose that, in place of (22), we have

(24) $|t\phi'(t)| > 1$ if $|t| > 1$;

and that $t\phi'(t)$ keeps the same sign, if $|t| > 1$. Then

(25) $\text{Minimum}\,|x_i| \leq a$.

Proof. Suppose, if possible, that $a > \text{Max}\,|x_i|$. Then each $|x_i/a| < 1$,
and by (22), $|(x_i/a)\phi'(x_i/a)| < 1$. Then (20) is not satisfied, since $|\sum| < n$.
Thus the hypothesis is contradicted.

Now (25) is satisfied at once if any $x_i = 0$. But suppose, on the other hand,
that $\text{Min}\,|x_i| > 0$; and, if possible, that $a < \text{Min}\,|x_i|$. Then, by (24) et
seq., since $|x_i/a| > 1$, it follows that $|\sum| > n$. And thus (20) is again
contradicted.
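The bounds (23) and (25) can be observed numerically. The sketch below solves the likelihood condition (20) by bisection for the normal case $\phi(t) = -t^2/2$, for which $t\phi'(t) = -t^2$ satisfies (19), (22) and (24). The solver `solve_scale`, its bracketing interval, and the sample measurements are assumptions made for illustration; bisection is used here only as a convenient root-finder and is not a method taken from the text.

```python
def solve_scale(xs, t_phi_prime, iterations=200):
    """Find a > 0 with sum t_i*phi'(t_i) + n = 0, t_i = x_i/a, by bisection.

    t_phi_prime(t) must return t * phi'(t). The bracket runs from a very
    small to a very large multiple of the largest |x_i|.
    """
    n = len(xs)

    def g(a):
        return sum(t_phi_prime(x / a) for x in xs) + n

    biggest = max(abs(x) for x in xs)
    lo, hi = 1e-6 * biggest, 1e6 * biggest
    if g(lo) * g(hi) > 0:
        raise ValueError("no sign change on the bracket")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def normal_t_phi_prime(t):
    # phi(t) = -t^2/2, hence t * phi'(t) = -t^2
    return -t * t


xs = [-3.0, 1.0, 2.0]
a = solve_scale(xs, normal_t_phi_prime)
print(a, min(abs(x) for x in xs) <= a <= max(abs(x) for x in xs))
```

In this case the scale found is the root-mean-square of the measurements, which lies between the least and greatest absolute values as the theorem requires.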

Theorem III 

Given the frequency function

(18) $f(t) = C a^{-1} e^{\phi(t)}, \quad t = x/a, \quad t_i = x_i/a, \quad C = \text{Constant};$

and set $\psi(t) = t\phi'(t) + 1$. Suppose that

(26) $\lim_{t \to 0} \psi(t) = \alpha, \quad \lim_{|t| \to \infty} \psi(t) = \beta, \quad \alpha\beta < 0.$

And suppose that $\psi(t)$ is continuous when $t \neq 0$.

Then, for any set of real numbers, $x_1, x_2, \ldots, x_n$, of which none is zero,
there exists a positive number a, as scale, such that the likelihood condition

(20) $\sum (x_i/a)\,\phi'(x_i/a) + n = 0$

is satisfied.

The conclusion is also valid, if in place of the limit $\beta$, there is postulated

(27) $\lim_{t \to -b+0} \psi(t) = -\alpha\,|\infty| = \lim_{t \to c-0} \psi(t),$

where b > 0, c > 0, and $\psi(t)$ is continuous for −b < t < 0 and for 0 < t < c.
That is, the new limits are to be infinite with sign opposite to that of $\alpha$.

Proof. The limits for $t \to 0$ and for $|t| \to \infty$ are the same as the limits
for $a \to \infty$ and $a \to 0+$, noting that t = x/a, $x \neq 0$. Thus $\sum\psi(t_i)$ changes
sign as a goes from 0+ to $\infty$. Hence, since $\psi$ is continuous, (20) is satisfied
for some positive a.

For the proof of the second part of the theorem, suppose that $x_n > 0$ and
that $x_n$ is the greatest $x_i$. Then with $a > x_n/c$, but approaching $x_n/c$, $\psi(x_n/a)$
becomes infinite with sign opposite to that of $\alpha$. Furthermore, in $\sum\psi(x_i/a)$,
the positive x’s $< x_n$ have a negligible effect; and thus $\lim \sum\psi(x_i/a)$, as
$a \to (x_n/c) + 0$, is infinite with sign opposite to that of $\alpha$, when this sum $\sum$
is taken for the positive x’s. Likewise, if $x_1 < 0$, and is the least $x_i$, $\lim \sum\psi(x_i/a)$,
as $a \to (-x_1/b) + 0$, is infinite with sign opposite to that of $\alpha$, when this sum
is taken for the negative x’s. If, now, the measurements happen to be all
positive, we think of a as approaching $x_n/c + 0$; and the continuity condition
leads to an a which makes $\sum\psi(x_i/a) = 0$. Likewise, if the measurements
happen to be all negative, we use $-x_1/b + 0$. If both positive and negative
x’s appear, we use the greater of the two ratios $-x_1/b$ and $x_n/c$.

SECTION 3. SOME FAIRLY REGULAR FREQUENCY FUNCTIONS 

To illustrate the foregoing theorems in a somewhat general manner, consider
the measurements, $x_1, x_2, \ldots, x_n$, and with t = x/a, $t_i = x_i/a$, set up the
function:

(28) $f(t) = C a^{-1}\,|kt|^p\,(1 + k^2t^2)^{-q}\,e^{-r|kt|^s}$

where, as before, C is a suitably chosen constant.

Suppose also that

(29) $p > -1, \quad q \geq 0, \quad r \geq 0, \quad s \geq 0;$

and that either

(30) $r > 0,\ s > 0 \quad\text{or}\quad r = 0,\ 2q > p + 1.$

Then with $\phi(t) = \log f(t)$, it follows that, when $t \neq 0$,

(31) $t\phi'(t) + 1 = (p + 1) - rs\,k^s|t|^s - 2q\,k^2t^2(1 + k^2t^2)^{-1}.$

Now the condition $1\cdot\phi'(1) = -1$ would be satisfied if $\Psi(k) = 0$, where

(32) $\Psi(k) = rs\,k^{s+2} + rs\,k^s + (2q - p - 1)k^2 - (p + 1).$

But, under the conditions (29) and (30), $\Psi(0) < 0$, and $\Psi(\infty) > 0$. Hence,
there is a positive k for which $\Psi(k) = 0$. Then if k be assigned this value,
(19) is satisfied; and by Theorem I, any scale a that the likelihood condition
(20) may lead to is a mean. But, by Theorem III a scale a will actually exist,
indeed, for any positive k that may be used in (28); since the limit of $t\phi'(t) + 1$
is positive as $t \to 0$, and is negative as $|t| \to \infty$.

Moreover, if in (29), the further condition $-1 < p \leq 0$ is introduced, (22) is
satisfied. And, thus, $a \leq \text{Maximum}\,|x_i|$. Also, $|t\phi'(t)|$ increases with $|t|$.
Hence, by (24) et seq., $\text{Minimum}\,|x_i| \leq a$.

If in (28), we set q = 0, s = 1, r > 0, and confine our attention to positive
x and t, there is obtained the Pearson Type III. Reference to (32) shows that
$\Psi(k) = 0$ if k = (p + 1)/r. With this substitution,

(33) $f(t) = C' a^{-1} t^p e^{-(p+1)t}, \quad C' = \text{Constant}.$

Since $\phi'(1) = -1$, any solution of the likelihood condition is a mean. Here,
with t > 0, $t\phi'(t) = p - (p + 1)t$, and $t^2\phi''(t) - 1 = -(p + 1)$. From (14)
we see that, with p + 1 > 0, any mean obtained corresponds to maximum
likelihood and the single maximum found is actually the largest value. Moreover,
with the measurements, $x_1, x_2, \ldots, x_n$, all positive, a scale a will exist, as
noted in the general case (28).
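For the Type III form (33) the likelihood condition can be solved in closed form: $\sum\{p - (p+1)x_i/a\} + n = (p+1)(n - \sum x_i/a) = 0$, so the scale is the arithmetic mean of the measurements whatever the value of p. The sketch below checks this on illustrative data; the function names and the sample values are assumptions, not taken from the text.

```python
def type3_likelihood_residual(xs, a, p):
    """Left member of (20) for the form (33): t*phi'(t) = p - (p + 1)*t."""
    return sum(p - (p + 1.0) * (x / a) for x in xs) + len(xs)


def type3_second_order_indicator(xs, p):
    """Right member of (14): each term t^2*phi''(t) - 1 equals -(p + 1)."""
    return sum(-(p + 1.0) for _ in xs)


measurements = [0.5, 1.5, 2.0, 4.0]
scale = sum(measurements) / len(measurements)   # arithmetic mean

for p in (-0.5, 0.0, 2.5):
    residual = type3_likelihood_residual(measurements, scale, p)
    indicator = type3_second_order_indicator(measurements, p)
    print(p, residual, indicator < 0)
```

The indicator is negative for every p > −1, in agreement with the statement that the mean obtained corresponds to maximum likelihood.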

In passing, it may be noted that Type III appears⁶ rather naturally in a
form giving $\phi'(1) = -1$ at once, without any transformation. Here, then, a
scale is a mean.

Given the Pearson Type I in the form 

(34) f(t) — Ca -1 (6 + kt) p (c — kt) q , t = x/a, b > 0, c > 0, | pq | > 0. 

If P + 9 + 1 > 0, it is possible to find a positive k so that with </> = log /, 
</>'(l) = — 1. In this case, any scale found by the likelihood condition is a 
mean. With k thus chosen, f(t) has essentially the same form as it would have 
if k = 1. Hence for convenience, let us simply set k — 1 in the above equation. 
Then for ~b < t < c, 

Ht) = Ut>'(t) + 1 = 1 + pt{b + f)~‘ - Qt(c - t)~ l . 

Suppose now that p > 0 and q > 0. Then Theorem III maybe applied; since 
lim i>{t) — 1, as t —* 0; but lim ^(<) —* — » f as t —* —b + 0, or as t —* c — 0 . 

* Carver, H. C., Handbook of Mathematical Statistics, Chap. VII, see p. 105, Line 4, 
noting that 4' = y'/y. 
Hence a scale $a$ satisfying the likelihood condition exists. Moreover, the likelihood
is at a maximum; since, with $-b < t < c$,

$t^2\phi''(t) - 1 = -pt^2(b + t)^{-2} - qt^2(c - t)^{-2} - 1 < 0$.

This maximum is also the largest value for all values of $a$.

If the Pearson Type IV is given in the form

(35) $f(t) = C a^{-1} (1 + k^2t^2)^{-p} e^{q \arctan kt}$, $t = x/a$,

then if $p > 1/2$, it is possible to find a positive $k$ which will make $\phi'(1) = -1$.
In this case, any scale $a$ is a mean. Moreover, for any $k \ne 0$, the limit of
$t\phi'(t) + 1$ is $1$ for $t \to 0$ and is $1 - 2p$ for $t \to \infty$. Hence, by Theorem III,
if $p > 1/2$, as above, then a scale $a$ exists satisfying the likelihood condition (20).


SECTION 4. FREQUENCY FUNCTIONS WITH CERTAIN PECULIARITIES 

The theorems of section 2 give sufficient conditions, which in some cases
may not be necessary. Nevertheless, by violating certain hypotheses, particular
functions may be set up which exhibit various peculiarities.

For the Pearson Types, the differential equation is

(36) $\dfrac{y'(t)}{y(t)} = \phi'(t) = \dfrac{a_0 + a_1 t}{b_0 + b_1 t + b_2 t^2}$, $t = x/a$.

The determination of a positive scale $a$ by the Fisher likelihood process is
impossible here, in case $a_0 = 0$, $a_1 > 0$, $b_0 + b_1 t + b_2 t^2 > 0$. For in this case
$t\phi'(t) \ge 0$; and thus (20) cannot be satisfied. The U-shaped Type II curves
are in this class. Likewise, if $a_0 \ne 0$, $a_1 = 0$, and $b_0 + b_1 t + b_2 t^2 > 0$ (for
example, with $b_2 > 0$, $b_1^2 < 4b_0b_2$) and the measurements all happen to have
the same sign as $a_0$, such scaling is impossible.

For the purpose of constructing peculiar functions we may take $c > 0$ and
require that the measurements $x_i$ be either $-c$ or $c$ (with at least one $-c$ and at
least one $c$) and that $\phi(t)$ be an even function. Then $\phi(-c) = \phi(c)$ and (11)
becomes

(37) $L = [C a^{-1} e^{\phi(c/a)}]^{n}$.

The likelihood condition (13) reduces to

(38) $0 = \psi(t) = t\phi'(t) + 1 = (c/a)\phi'(c/a) + 1$,

with the right member an even function of $c/a$. And from (14), a maximum
likelihood is indicated when

(39) $(c/a)^2\phi''(c/a) - 1 < 0$,

with the left member likewise an even function. A minimum likelihood is
indicated if the left member is positive.
Let us apply this to the case where

(40) $\phi(t) = (-2/3) \log (1 - 3|t|)$; $t\phi'(t) = 2|t|(1 - 3|t|)^{-1}$.

The likelihood condition (38) is satisfied only when $t = \pm 1$. Also $\phi'(1) = -1$.
Thus the only means are the internal means $\pm c$; and the only scale conformable
to (38) is $a = c$. But this has minimum likelihood; since $1^2 \cdot \phi''(1) - 1 = 1/2 > 0$.
For positive $t$, this function (40) is a Pearson Type.

Consider next a function of the form (28), with $p = -1.25$, $q = -0.5$,
however, for which (31) becomes

(41) $t\phi'(t) + 1 = -1/4 - t^2/4 + t^2/(1 + t^2) = -(1 - t^2)^2/4(1 + t^2)$,

whence $\phi'(1) = -1$, $\phi''(1) = +1$, $\phi'''(1) = -3$. Here the likelihood condition
(38) has but a single absolute solution $|t| = 1$, leading to the single scale
$a = c$, and to the two internal means, $\pm c$. But, in this case $1^2 \cdot \phi''(1) - 1 = 0$,
so that $\partial^2 \log L/\partial a^2 = 0$. Moreover, for $t = 1$, $\partial^3 \log L/\partial a^3 = n a^{-3} \ne 0$. Thus,
the only scale obtained by the likelihood method (38), viz., $a = c$, has a
likelihood which is neither at a maximum nor at a minimum.

Another anomalous function is that given by

(42) $\phi(t) = t^4 - 2.5t^2$, $t = \pm c/a$.

The likelihood condition (38) leads to

$\psi(t) = (1 - t^2)(1 - 4t^2) = 0$.

The only solutions are $t = \pm 1$, giving internal means $\pm c$; and $t = \pm 1/2$, giving
external means $\pm 2c$. And from (39) et seq., it can be shown that the internal
mean and scale, $a = c$, has minimum likelihood, while the external mean and
scale, $a = 2c$, has maximum likelihood.

But it will be noted that a maximum value for a vicinity does not always
signify a largest value for the entire possible range. Indeed, for the function
(42), $a = 2c$ has maximum likelihood without having the largest likelihood.
To avoid such an anomaly, a necessary condition is that as $|t| \to \infty$,
$\phi(t) \to -\infty$; as seen by taking the logarithm of $L$ in (37), noting that as $a \to 0$,
$(-\log a) \to +\infty$.

Finally avoiding the anomaly just mentioned, let us set up a frequency
function, using the $\psi(t)$ in (38), and writing

$\psi(t) = 1 + t\phi'(t) = (1 - 2t^2)(1 - t^2)(1 - 0.9t^2)$.

From this it follows readily that

(43) $\phi(t) = K - 1.95t^2 + 1.175t^4 - 0.3t^6$, $K = \text{Constant}$.

This, with $t_i = \pm c/a$, leads to an internal mean or scale $a = c$ with minimum
likelihood, a nearby scale $a = c\sqrt{0.9}$ with maximum likelihood (differing
indeed only slightly from the minimum just mentioned) and another scale
$a = c\sqrt{2}$ having maximum likelihood, and this likelihood is indeed greater
than that for any other positive value of $a$. The external mean $a = c\sqrt{2}$
in this case has the largest likelihood. This may be checked by the use of
the logarithm of $L$ as it appears in (37), in which the important part is
$\phi(c/a) - \log a$.
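The check suggested in the preceding paragraph can be sketched numerically. The snippet below is only an illustration under stated assumptions: $K = 0$, $c = 1$, and the function names are hypothetical, not the author's. It evaluates $\phi(c/a) - \log a$ at the three scales and on a coarse grid of $a$.

```python
import math

def phi(t):
    # phi(t) from equation (43), with the arbitrary constant K taken as 0 (assumption)
    return -1.95 * t**2 + 1.175 * t**4 - 0.3 * t**6

def psi(t):
    # psi(t) = 1 + t*phi'(t), in the factored form given in the text
    return (1 - 2 * t**2) * (1 - t**2) * (1 - 0.9 * t**2)

def log_likelihood_part(a, c=1.0):
    # the "important part" of log L from (37): phi(c/a) - log a
    return phi(c / a) - math.log(a)

c = 1.0  # illustrative value of c (assumption)
a_internal = c                      # internal mean, minimum likelihood
a_nearby = c * math.sqrt(0.9)       # nearby local maximum
a_external = c * math.sqrt(2.0)     # external mean, largest likelihood

g_internal = log_likelihood_part(a_internal, c)
g_nearby = log_likelihood_part(a_nearby, c)
g_external = log_likelihood_part(a_external, c)

# coarse grid of positive scales used to look for any larger value
grid = [0.2 + 0.001 * i for i in range(10000)]
g_grid_max = max(log_likelihood_part(a, c) for a in grid)
```

The three scales are exactly the roots of $\psi(c/a) = 0$, so comparing the three values of $\phi(c/a) - \log a$ is enough to rank the stationary points.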

In passing it may be noted that if $\psi(t)$ has the form $\psi(t) = (1 - t)H(t)$, with
$H(1) \ne \infty$, and $t_i = x_i/a$; then any solution $a$ of the likelihood condition
$\psi(t) = 0$ is a mean by Theorem I.

SECTION 5. SUMMARY 

When the R. A. Fisher likelihood method is used to find an “optimum” scale
for frequency functions, it sometimes happens that this scale is a well known
mean or at least is a substitutive mean (see Equation (5)). Or a simple transformation
(15) may often put the frequency function into such a form. Conditions
are given under which a scale will be a mean. Under further conditions
this mean will be internal, at least as regards absolute values. Finally,
under certain conditions, a scale will exist.

But for certain functions not satisfying these conditions, anomalies appear. 
The scale given by the usual likelihood condition may be a scale with a minimum 
likelihood. Sometimes the likelihood will be at neither a maximum nor a 
minimum. In certain simple cases, no scale exists. Furthermore, it may 
happen that the scales which are internal means have minimum likelihood and 
those that are external means have maximum likelihood. Among Pearson 
Types are found both anomalous functions and functions which would be 
regarded as regular as regards maximum likelihood. 

In this problem of scaling, likelihood is proportional to a posteriori probability
with the a priori probability taken as constant.



MOMENTS OF ANY RATIONAL INTEGRAL ISOBARIC SAMPLE 
MOMENT FUNCTION 

By Paul S. Dwyer 

Introduction 

The problem of moments of moments has been investigated by a number of 
authors. The assumption of an infinite universe (or that of a finite universe 
with replacements) permits the application of the “algebraic” method, the 
method of semi-invariants as introduced by Thiele (1) and developed by C. C. 
Craig (2) and the combinatorial analysis method introduced by R. A. Fisher (3) 
and used by N. St. Georgescu (4). A combinatorial analysis method has the 
particular advantage that it enables one to compute separate terms of a given 
formula. 

The formulae for moments of moments have been simplified through the
use of new moment functions. Thiele introduced the half-invariant (1) which 
resulted in considerable condensation. More recently Prof. R. A. Fisher (3) 
has introduced the sample function k whose expected value is a half invariant. 
The most compact formulization presented thus far is his formulation of the 
half invariants of the sample k r in terms of the half invariants of the universe. 
This very compactness, however, makes it difficult to compare results with 
those expressed in the more conventional sample functions. Dr. Wishart has 
written a paper (7) in which he shows, among other things, how the Fisher results 
can be translated to the more conventional (Craig) results and vice versa, but 
such translation is in general no simple matter. It appears that the Fisher 
results are not immediately useful to the statistician who desires the formulae 
to be expressed in terms of the usual sample moment function. On the other 
hand the Fisher formulization is a remarkable discovery toward that harmony 
which must be naturally inherent in the field of moments of moments. Soper 
(6, 111) expressed the general situation when he wrote, “If the terrifying overgrowth
of algebraic formulation accompanying this branch of statistical inquiry
is destined to have a chief utility in induction and going back to causes, then
perhaps Dr. Fisher’s way of estimating a sample will prove to be most fertile,
but if it is to be applied to problems of deduction, say to problems of successive
eventuation such as propagation, then Mr. Craig’s plain moments seem
to have a firmer hold on the exigencies of time.”

It would appear then that the Fisher formulae and the Craig formulae are 
both needed. Georgescu (4) showed a partial connection between them in 
applying to the m functions a combinatory analysis somewhat similar to that 
applied by R. A. Fisher to the k function. It is the purpose of the present 
paper to work out a combinatorial procedure for a more general sample function
so that either the Fisher or Georgescu combinatorial results come out as special
cases. In making such a generalization no limitation is placed on the sample
function except that it be rational integral and that all terms are of the same
weight. Thus the results are applicable to $m_r^2$, $m_r + k_r$, $m_r k_r$, etc. as well
as to $m_r$ and $k_r$ although they are not applicable to $\sqrt{m_r}$ or $m_r/k_r$. In this way
the important formulae for the moments of a new sample moment function
will be available by simple substitution as soon as any such new function is
defined by a rational integral isobaric expansion of power sums.

It is thus the purpose of this paper to determine the moments of a general 
moment function of the sample. This is done by keeping the multipliers of 
the various partitions of power sums indefinite until all manipulation is complete. 
It is then possible to assign the definite values of these multipliers which are 
associated with the desired sample function and to obtain the moment of 
the desired moment function in this way. Thus the Fisher result $\kappa(42)$ and
the Craig result $S_{11}(\nu_4, \nu_2)$ are special cases of the new result $\lambda_{11}(f_4, f_2)$. It
is obvious that it is not possible to carry the results using these general moment
functions as far as Fisher and Wishart (3), (5), (7), have carried the results of
the decidedly advantageous (from the standpoint of simplicity of result) $k$ function
and yet it is surprising to find the simplicity which can be obtained in
the general case. Incidentally the introduction of the more general symbols
clarifies the successive steps of the partition analysis which are somewhat confusing
in any specific case because of the insertion of the value of the coefficients
of the power sums in which the sample moment function is expressed.

This paper is divided into three parts. The first part includes the necessary
definitions, the basic formulae, and the general development of the algebraic
method. In order to facilitate the algebraic work there is inserted a table giving
the expected values of all possible partition products of power sums whose
weight is $\le 8$. The second part deals with the different sample functions which
might be used. The third part gives a list of the various partition formulae,
of weight $\le 8$, which contain no unit parts and shows how these can be used in
writing the chief variations of the formulae for moments of moments.

Part I 

1. General Moment Functions. Different moment functions have been defined
in various ways, but all moment functions have in common the property
that they may be expressed in terms of the power sums. It appears sensible
to use this expression in terms of power sums as the working algebraic definition
of moment functions. For example the function $k_3$, which is defined by R. A.
Fisher to be that function of the sample whose expected value is the third
cumulant (half invariant) is to be given the working definition of

$$k_3 = \frac{n(3)}{(n-1)(n-2)} - \frac{3(2)(1)}{(n-1)(n-2)} + \frac{2(1)(1)(1)}{n(n-1)(n-2)}$$

where the numerical expressions in parentheses indicate power sums of the
sample.
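As an illustration of this working definition, the following sketch evaluates $k_3$ from the power sums $(1)$, $(2)$, $(3)$ of a small sample and compares it with the familiar form $n\sum(x-\bar{x})^3/[(n-1)(n-2)]$. The function names and the sample are illustrative assumptions; exact rational arithmetic is used so that the comparison is exact.

```python
from fractions import Fraction

def power_sum(xs, r):
    # the power sum (r): sum of the r-th powers of the sample values
    return sum(Fraction(x) ** r for x in xs)

def k3_from_power_sums(xs):
    # working definition of k_3 in terms of the power sums (1), (2), (3)
    n = len(xs)
    s1, s2, s3 = (power_sum(xs, r) for r in (1, 2, 3))
    d = Fraction((n - 1) * (n - 2))
    return n * s3 / d - 3 * s2 * s1 / d + 2 * s1**3 / (n * d)

def k3_direct(xs):
    # the same statistic written with deviations from the sample mean
    n = len(xs)
    mean = power_sum(xs, 1) / n
    return Fraction(n, (n - 1) * (n - 2)) * sum((Fraction(x) - mean) ** 3 for x in xs)

sample = [1, 2, 4, 7, 11]  # arbitrary illustrative data
```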

Every term in the definition of a sample function has a “weight” which is
equal to the sum of the orders of the power sums whose product is indicated by the term.
Thus the weight of each of the terms of $k_3$ is 3. If all the terms of a given
moment function have the same weight, the function is called isobaric and
the weight of the function is equal to the weight of each term. Thus $k_3$ is an
isobaric moment function and its weight is 3. Since all the functions so far
proposed are isobaric we limit this generalization of moment functions to isobaric
moment functions although it is possible that a more complex analysis
could be worked out for non-isobaric functions.

Generality demands the inclusion of every possible partition product of 
power sums. Such generality can be obtained by writing 

$$f_1 = a'_1(1)$$
$$f_2 = a'_2(2) + a'_{11}(1)^2$$
$$f_3 = a'_3(3) + a'_{21}(2)(1) + a'_{111}(1)^3$$
$$f_4 = a'_4(4) + a'_{31}(3)(1) + a'_{22}(2)^2 + a'_{211}(2)(1)^2 + a'_{1111}(1)^4$$

and in general

$$f_r = \sum a'_{p_1^{\pi_1} p_2^{\pi_2} \cdots p_s^{\pi_s}} (p_1)^{\pi_1} (p_2)^{\pi_2} \cdots (p_s)^{\pi_s}$$

where $(p_1)^{\pi_1} (p_2)^{\pi_2} \cdots (p_s)^{\pi_s}$ indicates any partition product of power sums,
$a'_{p_1^{\pi_1} \cdots p_s^{\pi_s}}$ is its coefficient and the summation is taken for every possible partition.
The number of parts of the partition is $\rho = \sum \pi_i$. It may be assumed,
without loss of generality, that the partition is ordered, i.e.

$$p_1 \ge p_2 \ge p_3 \ge \cdots \ge p_s.$$

A natural numerical coefficient of each term is the number of ways the $r$
units can be collected to form the given partition. This value is given by

$$\binom{r}{p_1^{\pi_1} p_2^{\pi_2} \cdots p_s^{\pi_s}} = \frac{r!}{(p_1!)^{\pi_1} (p_2!)^{\pi_2} \cdots (p_s!)^{\pi_s}\, \pi_1!\, \pi_2! \cdots \pi_s!}.$$

If we set

$$a'_{p_1^{\pi_1} \cdots p_s^{\pi_s}} = \binom{r}{p_1^{\pi_1} \cdots p_s^{\pi_s}} a_{p_1^{\pi_1} \cdots p_s^{\pi_s}}$$

the definition of $f_r$ becomes

$$f_r = \sum \binom{r}{p_1^{\pi_1} \cdots p_s^{\pi_s}} a_{p_1^{\pi_1} \cdots p_s^{\pi_s}} (p_1)^{\pi_1} \cdots (p_s)^{\pi_s}.$$
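The numerical coefficient just defined is easy to compute mechanically. The sketch below is a hypothetical helper, not part of the original paper: it takes a partition as a tuple of parts and returns $r!/[(p_1!)^{\pi_1}\cdots(p_s!)^{\pi_s}\,\pi_1!\cdots\pi_s!]$.

```python
from collections import Counter
from math import factorial

def partition_coefficient(parts):
    # number of ways r = sum(parts) distinct units can be collected
    # into groups whose sizes are the given parts
    r = sum(parts)
    denom = 1
    for p, multiplicity in Counter(parts).items():
        # (p!)^pi for the units inside the groups, pi! for equal-sized groups
        denom *= factorial(p) ** multiplicity * factorial(multiplicity)
    return factorial(r) // denom

# coefficients for the partitions of weight 4: (4), (31), (22), (211), (1111)
weight_4 = [partition_coefficient(p)
            for p in [(4,), (3, 1), (2, 2), (2, 1, 1), (1, 1, 1, 1)]]
```

For weight 4 this reproduces the multipliers 1, 4, 3, 6, 1 that appear in front of $a_4$, $a_{31}$, $a_{22}$, $a_{211}$, $a_{1111}$ in the later sections.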


In the present paper the capital letters are used to represent the corresponding
functions of the universe as defined by the corresponding power sums of the
universe. Thus

$$F_r = \sum \binom{r}{p_1^{\pi_1} \cdots p_s^{\pi_s}} a_{p_1^{\pi_1} \cdots p_s^{\pi_s}} (P_1)^{\pi_1} \cdots (P_s)^{\pi_s}$$

represents the corresponding function of the universe. In the case of the
moment about the mean and the semi-invariant the Greek letters $\mu$ and $\lambda$ have
been used to represent the corresponding function of the universe. In the
case of functions whose notation is quite widely established, it is preferable to
use the conventional notation, but in introducing new functions it appears
wise to use the relationship between small and capital letters since the correspondence
between the English and Greek alphabets is not exactly one to one.
It should be particularly noticed that this notation does not agree with a previously
accepted scheme of using the small English letter to indicate the function
whose expected value is indicated by the corresponding Greek letter. In the
present paper it is not the expected value property which serves as the basis
of notation but rather the definition of the function in terms of the partition
products of power sums.

2. The Working Definition of Moments About a Fixed Point. The sample
functions defined by

$$m'_r = \frac{(r)}{n}$$

are obtained from $f_r$ by placing

$$a_{p_1^{\pi_1} \cdots p_s^{\pi_s}} = \begin{cases} \dfrac{1}{n} & \text{when } s = 1,\ \pi_1 = 1, \text{ and } p_1 = r, \\ 0 & \text{in all other cases.} \end{cases}$$

The Greek $\mu'_r$ is used to indicate the corresponding function of the universe.

3. The Working Definition of Moments About the Mean. The moments
about the mean are defined by

$$m'_1 = \frac{(1)}{n}, \qquad m_2 = \frac{(2)}{n} - \frac{(1)(1)}{n^2},$$

$$m_3 = \frac{(3)}{n} - \frac{3(2)(1)}{n^2} + \frac{2(1)^3}{n^3},$$

$$m_4 = \frac{(4)}{n} - \frac{4(3)(1)}{n^2} + \frac{6(2)(1)^2}{n^3} - \frac{3(1)^4}{n^4},$$

and in general $m_r$ is obtained from $f_r$ by placing

$$a_{p_1^{\pi_1} \cdots p_s^{\pi_s}} = \begin{cases} \dfrac{1}{n} & \text{if } s = 1,\ \pi_1 = 1, \text{ and } p_1 = r, \\ \dfrac{(-1)^{\pi_2}}{n^{\pi_2 + 1}} & \text{if } p_1 > 1,\ \pi_1 = 1,\ s = 2, \text{ and } p_2 = 1, \\ \dfrac{(-1)^{r-1}(r-1)}{n^{r}} & \text{if } p_1 = 1,\ s = 1, \text{ and } \pi_1 = r, \\ 0 & \text{in all other cases.} \end{cases}$$

The corresponding moments of the universe are indicated by the conventional $\mu$.
For conciseness moments about the mean are referred to as “moments.”
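The coefficient rule for $m_r$ can be checked against the direct definition $\frac{1}{n}\sum(x-\bar{x})^r$. The following is a minimal sketch for $r \ge 2$; the function names and the sample are illustrative assumptions, and the numerical coefficient of the partition $(r-j)(1)^j$ is taken to be the binomial coefficient $\binom{r}{j}$, which is what the general partition coefficient reduces to in that case.

```python
from fractions import Fraction
from math import comb

def power_sum(xs, r):
    # the power sum (r) of the sample
    return sum(Fraction(x) ** r for x in xs)

def moment_from_power_sums(xs, r):
    # m_r (r >= 2) assembled from the coefficient rule for the a's
    n = len(xs)
    s1 = power_sum(xs, 1)
    total = power_sum(xs, r) / n  # partition (r): a = 1/n
    # partitions (r-j)(1)^j with r-j > 1: coefficient C(r, j), a = (-1)^j / n^(j+1)
    for j in range(1, r - 1):
        total += comb(r, j) * Fraction((-1) ** j, n ** (j + 1)) * power_sum(xs, r - j) * s1**j
    # partition (1)^r: coefficient 1, a = (-1)^(r-1) (r-1) / n^r
    total += Fraction((-1) ** (r - 1) * (r - 1), n**r) * s1**r
    return total

def moment_direct(xs, r):
    # m_r computed directly from deviations about the sample mean
    n = len(xs)
    mean = power_sum(xs, 1) / n
    return sum((Fraction(x) - mean) ** r for x in xs) / n

sample = [1, 2, 4, 7, 11]  # arbitrary illustrative data
```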


4. The Working Definition of the Half Invariants. The half invariant
moment functions of Thiele, as applied to the sample power sums are [see C. C.
Craig (2, 7-10) and Frisch (12, 20-21)]

$$l_1 = \frac{(1)}{n}, \qquad l_2 = \frac{(2)}{n} - \frac{(1)(1)}{n^2}, \qquad l_3 = \frac{(3)}{n} - \frac{3(2)(1)}{n^2} + \frac{2(1)^3}{n^3},$$

$$l_4 = \frac{(4)}{n} - \frac{4(3)(1)}{n^2} - \frac{3(2)^2}{n^2} + \frac{12(2)(1)^2}{n^3} - \frac{6(1)^4}{n^4},$$

and in general

$$l_r = \sum \frac{(-1)^{\rho-1}(\rho-1)!}{n^{\rho}} \binom{r}{p_1^{\pi_1} \cdots p_s^{\pi_s}} (p_1)^{\pi_1} (p_2)^{\pi_2} \cdots (p_s)^{\pi_s}$$

so that

$$a_{p_1^{\pi_1} \cdots p_s^{\pi_s}} = \frac{(-1)^{\rho-1}(\rho-1)!}{n^{\rho}}.$$

The corresponding moments of the universe are indicated, after Thiele (1)
and Craig (2), by $\lambda$. R. A. Fisher (3) used $\kappa$ while Georgescu (4) used $s$.

In the present paper these functions are referred to as “Thiele moments.”


5. The k Functions of R. A. Fisher. The $k$ statistics of R. A. Fisher are
defined in terms of the sample power sums by

$$k_1 = \frac{(1)}{n}, \qquad k_2 = \frac{(2)}{n-1} - \frac{(1)^2}{n(n-1)},$$

$$k_3 = \frac{n(3)}{(n-1)(n-2)} - \frac{3(2)(1)}{(n-1)(n-2)} + \frac{2(1)^3}{n^{(3)}},$$

$$k_4 = \frac{n(n+1)(4)}{(n-1)^{(3)}} - \frac{4(n+1)(3)(1)}{(n-1)^{(3)}} - \frac{3(2)^2}{(n-2)^{(2)}} + \frac{12(2)(1)^2}{(n-1)^{(3)}} - \frac{6(1)^4}{n^{(4)}},$$

where $n^{(j)} = n(n-1)\cdots(n-j+1)$.

These values and values for $k_5$ and $k_6$ are given by R. A. Fisher (3, 203-4)
while algebraic methods of attaining them are presented in sections 16, 17.
They are referred to as Fisher moments. The corresponding functions of the
universe, if used, would be represented by $K_r$.
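The power-sum form of $k_4$ can be compared with the well-known expression of the fourth $k$ statistic in sample moments about the mean, $k_4 = n^2[(n+1)m_4 - 3(n-1)m_2^2]/[(n-1)(n-2)(n-3)]$. The sketch below makes that comparison exactly; the helper names and data are illustrative assumptions, and `falling(n, j)` stands for $n^{(j)}$.

```python
from fractions import Fraction

def falling(n, j):
    # falling factorial n^(j) = n (n-1) ... (n-j+1)
    out = 1
    for i in range(j):
        out *= n - i
    return out

def power_sum(xs, r):
    return sum(Fraction(x) ** r for x in xs)

def k4_from_power_sums(xs):
    # k_4 written with the coefficients of the power-sum products
    n = len(xs)
    s1, s2, s3, s4 = (power_sum(xs, r) for r in (1, 2, 3, 4))
    return (Fraction(n * (n + 1), falling(n - 1, 3)) * s4
            - Fraction(4 * (n + 1), falling(n - 1, 3)) * s3 * s1
            - Fraction(3, falling(n - 2, 2)) * s2**2
            + Fraction(12, falling(n - 1, 3)) * s2 * s1**2
            - Fraction(6, falling(n, 4)) * s1**4)

def k4_from_central_moments(xs):
    # the same statistic in terms of the sample moments m_2 and m_4
    n = len(xs)
    mean = power_sum(xs, 1) / n
    m2 = sum((Fraction(x) - mean) ** 2 for x in xs) / n
    m4 = sum((Fraction(x) - mean) ** 4 for x in xs) / n
    return n**2 * ((n + 1) * m4 - 3 * (n - 1) * m2**2) / ((n - 1) * (n - 2) * (n - 3))

sample = [1, 2, 4, 7, 11, 12]  # arbitrary illustrative data, n >= 4
```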


6. The h Function. Just as Fisher introduced a sample function whose
expected value is a Thiele moment of the universe, so it is possible to introduce
a function whose expected value is a moment of the universe. Such a function
is defined by

$$h_2 = \frac{(2)}{n-1} - \frac{(1)^2}{n(n-1)},$$

$$h_3 = \frac{n(3)}{(n-1)(n-2)} - \frac{3(2)(1)}{(n-1)(n-2)} + \frac{2(1)^3}{n^{(3)}},$$

$$h_4 = \frac{(n^2 - 2n + 3)(4)}{(n-1)^{(3)}} - \frac{4(n^2 - 2n + 3)(3)(1)}{n^{(4)}} - \frac{3(2n-3)(2)^2}{n^{(4)}} + \frac{6(2)(1)^2}{(n-1)^{(3)}} - \frac{3(1)^4}{n^{(4)}}.$$

Methods of obtaining the expansion of this function in terms of power sums
are presented in section 18. The corresponding function of the universe, if it
were used, would be represented by $H_r$.


7. Other Moment Functions. It is possible to obtain an indefinite number of 
moment functions. For example one might define a function of weight 2 whose 
variance equals m, (or nl). It is possible by the methods of this paper to 
find expressions for such moments. 

For reference purposes Table I is provided showing the values of $a$ for each
partition of weight $< 6$ for the functions $m'$, $m$, $l$, $h$, $k$. The values of

$$\binom{r}{p_1^{\pi_1} \cdots p_s^{\pi_s}}$$

are also inserted, in the left hand column, so that it is possible to read from the
table the values for $f = m'_r, m_r, l_r, h_r, k_r$ when $r < 6$.


8. Products of f Functions. The product of two or more isobaric functions
is also isobaric and of weight equal to the sum of the weights of the functions.
Thus

$$f_2 f_1 = [a_2(2) + a_{11}(1)(1)][a_1(1)] = a_2a_1(2)(1) + a_{11}a_1(1)^3.$$

In multiplying $f_{r_1}$ by $f_{r_2}$ any term of $f_{r_1}$ is of weight $r_1$ and when it is multiplied
by any term of $f_{r_2}$, of weight $r_2$, the result is a term of weight $r_1 + r_2$.
TABLE I

Coefficients of Products of Power Sums in the Expansion of Different Moment
Functions

TABLE I (Concluded)

Numerical
coefficient   a          m'_r   m_r      l_r      k_r                   h_r
10            a_311      0      1/n^3    2/n^3    2(n+2)/(n-1)^{(4)}    (n^2-4n+8)/n^{(5)}
15            a_221      0      0        2/n^3    2(n-1)/(n-1)^{(4)}    (2n-4)/n^{(5)}
10            a_2111     0      -1/n^4   -6/n^4   -6/(n-1)^{(4)}        -1/(n-1)^{(4)}
1             a_11111    0      4/n^5    24/n^5   24/n^{(5)}            4/n^{(5)}



R. A. Fisher [3, 207] used the product $k_3^2k_2$ as an illustration of the algebraic
method. The more general $f_3^2f_2$ gives

$$f_3^2f_2 = [a_3(3) + 3a_{21}(2)(1) + a_{111}(1)^3]^2[a_2(2) + a_{11}(1)(1)]$$

$$= a_3^2a_2(3)(3)(2) + a_3^2a_{11}(3)(3)(1)(1) + 6a_3a_{21}a_2(3)(2)^2(1)$$

$$+ [6a_3a_{21}a_{11} + 2a_3a_2a_{111}](3)(2)(1)^3 + 9a_{21}^2a_2(2)^3(1)^2 + 2a_3a_{111}a_{11}(3)(1)^5$$

$$+ [6a_{21}a_{111}a_2 + 9a_{21}^2a_{11}](2)^2(1)^4 + [6a_{21}a_{111}a_{11} + a_2a_{111}^2](2)(1)^6 + a_{111}^2a_{11}(1)^8$$

which reduces to the value as given by him when the values of $a$ are substituted
from Table I.

9. The Expected Value of Any Partition Product. The expected values of 
partition products are well known and are indicated by 

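$$E(p_1) = n\mu'_{p_1}$$

$$E(p_1)(p_2) = n\mu'_{p_1+p_2} + n(n-1)\mu'_{p_1}\mu'_{p_2}$$

$$E(p_1)(p_2)(p_3) = n\mu'_{p_1+p_2+p_3} + n(n-1)[\mu'_{p_1+p_2}\mu'_{p_3} + \mu'_{p_1+p_3}\mu'_{p_2} + \mu'_{p_2+p_3}\mu'_{p_1}] + n(n-1)(n-2)\mu'_{p_1}\mu'_{p_2}\mu'_{p_3}$$

and in general

$$E(p_1)^{\pi_1}(p_2)^{\pi_2} \cdots (p_s)^{\pi_s} = \sum n^{(\rho)} \binom{p_1^{\pi_1} p_2^{\pi_2} \cdots p_s^{\pi_s}}{q_1^{\chi_1} q_2^{\chi_2} \cdots q_t^{\chi_t}} (\mu'_{q_1})^{\chi_1} (\mu'_{q_2})^{\chi_2} \cdots (\mu'_{q_t})^{\chi_t}$$

where $\rho = \chi_1 + \chi_2 + \chi_3 + \cdots + \chi_t$ and

$$\binom{p_1^{\pi_1} p_2^{\pi_2} \cdots p_s^{\pi_s}}{q_1^{\chi_1} q_2^{\chi_2} \cdots q_t^{\chi_t}}$$

indicates the number of ways in which the partition $p_1^{\pi_1} p_2^{\pi_2} \cdots p_s^{\pi_s}$ can be grouped to
form the partition $q_1^{\chi_1} q_2^{\chi_2} \cdots q_t^{\chi_t}$.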
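The two- and three-factor formulae above can be verified exactly for a small universe by enumerating every possible sample. The sketch below is an illustration only: the three-point universe, its probabilities, and the function names are assumptions made for the example, and sampling is taken to be from an infinite universe (independent draws).

```python
from fractions import Fraction
from itertools import product

# a small illustrative universe: values with their probabilities (assumption)
values = [0, 1, 3]
probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]

def raw_moment(r):
    # universe moment about the origin, mu'_r
    return sum(p * Fraction(v) ** r for v, p in zip(values, probs))

def expected_power_sum_product(orders, n):
    # exact E[(p1)(p2)...] by enumerating all samples of size n
    total = Fraction(0)
    for idx in product(range(len(values)), repeat=n):
        weight = Fraction(1)
        for i in idx:
            weight *= probs[i]
        term = Fraction(1)
        for r in orders:
            term *= sum(Fraction(values[i]) ** r for i in idx)
        total += weight * term
    return total

def formula_two(p1, p2, n):
    # E(p1)(p2) = n mu'_{p1+p2} + n(n-1) mu'_{p1} mu'_{p2}
    return n * raw_moment(p1 + p2) + n * (n - 1) * raw_moment(p1) * raw_moment(p2)

def formula_three(p1, p2, p3, n):
    # the three-factor formula with its n, n(n-1) and n(n-1)(n-2) groups
    m = raw_moment
    return (n * m(p1 + p2 + p3)
            + n * (n - 1) * (m(p1 + p2) * m(p3) + m(p1 + p3) * m(p2) + m(p2 + p3) * m(p1))
            + n * (n - 1) * (n - 2) * m(p1) * m(p2) * m(p3))
```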

The continued application of the result above leads to a large number of
formulae. In order to make these results accessible I present in Table II the
expected values of all partition products of weight $\le 8$. The essence of the
table is the evaluation of the expression

$$\binom{p_1^{\pi_1} p_2^{\pi_2} \cdots p_s^{\pi_s}}{q_1^{\chi_1} q_2^{\chi_2} \cdots q_t^{\chi_t}}.$$

The numbers at the top of each column indicate the subscripts of the $\mu'$'s which must, of
course, be multiplied by $n^{(\rho)}$. The entries on the extreme left are the numerical
coefficients associated with each row.


10. The Expected Values of the f Functions. With the use of Table II one
is able to write expressions for the expected values of $f_r$ when $r < 9$.

$$\mu'_1(f_1) = E(f_1) = a_1n\mu'_1$$

$$\mu'_1(f_2) = E(f_2) = (a_2 + a_{11})n\mu'_2 + a_{11}n(n-1)\mu_1'^2$$

$$\mu'_1(f_3) = E(f_3) = (a_3 + 3a_{21} + a_{111})n\mu'_3 + 3(a_{21} + a_{111})n(n-1)\mu'_2\mu'_1 + a_{111}n(n-1)(n-2)\mu_1'^3 \quad \text{etc.}$$

If the expected values of the $f$ functions are expressed in terms of the moments
about the mean of the universe, these formulae become, since $\mu_1 = 0$

$$\mu'_1(f_1) = 0$$

$$\mu'_1(f_2) = (a_2 + a_{11})n\mu_2$$

$$\mu'_1(f_3) = (a_3 + 3a_{21} + a_{111})n\mu_3$$

$$\mu'_1(f_4) = (a_4 + 4a_{31} + 3a_{22} + 6a_{211} + a_{1111})n\mu_4 + 3(a_{22} + 2a_{211} + a_{1111})n(n-1)\mu_2^2 \quad \text{etc.}$$

These may be written more symbolically as

$$\mu'_1(f_1) = 0$$

$$\mu'_1(f_2) = b_2n\mu_2$$

$$\mu'_1(f_3) = b_3n\mu_3$$

$$\mu'_1(f_4) = b_4n\mu_4 + 3b_{22}n(n-1)\mu_2^2 \quad \text{etc.}$$

11. The Expected Value of Products of f Functions. The expected value of
products of $f$ functions may be similarly found. For example

$$\mu'_2(f_2) = E(f_2^2) = E[a_2(2) + a_{11}(1)^2]^2 = a_2^2E(2)^2 + 2a_2a_{11}E(2)(1)(1) + a_{11}^2E(1)^4.$$

Table II can now be used by indicating $a_2^2$ as a multiplier of $E(2)^2$, $2a_2a_{11}$ as a
multiplier of $E(2)(1)(1)$ and $a_{11}^2$ as a multiplier of $E(1)^4$. Then at once it is
evident that

$$\mu'_2(f_2) = (a_2^2 + 2a_2a_{11} + a_{11}^2)n\mu_4 + (a_2^2 + 2a_2a_{11} + 3a_{11}^2)n(n-1)\mu_2^2$$

$$= (a_2 + a_{11})^2n\mu_4 + [(a_2 + a_{11})^2 + 2a_{11}^2]n(n-1)\mu_2^2$$

$$= b_2^2n\mu_4 + (b_2^2 + 2b_{11}^2)n(n-1)\mu_2^2.$$

Similarly

$$\mu'_{11}(f_3, f_2) = b_3b_2n\mu_5 + (b_3b_2 + 3b_{21}b_2 + 6b_{21}b_{11})n(n-1)\mu_3\mu_2$$

$$\mu'_2(f_3) = b_3^2n\mu_6 + (9b_{21}^2 + 6b_3b_{21})n(n-1)\mu_4\mu_2 + (b_3^2 + 9b_{21}^2)n(n-1)\mu_3^2 + (9b_{21}^2 + 6b_{111}^2)n(n-1)(n-2)\mu_2^3$$

etc.

where $b_3 = a_3 + 3a_{21} + a_{111}$, $b_{21} = a_{21} + a_{111}$, $b_{111} = a_{111}$. The important
special cases are obtained by assigning the proper values to the $a$'s as given
in Table I. Thus

$$\mu'_2(m_2) = \frac{1}{n^3}[(n-1)^2\mu_4 + (n^2 - 2n + 3)(n-1)\mu_2^2]$$

which agrees with the corrected result of “Student” in 1908 (8, 3) and Tchouproff
(10, 192). Similarly

$$\mu'_{11}(m_3, m_2) = \frac{1}{n^4}[(n-1)^2(n-2)\mu_5 + (n-1)(n-2)(n^2 - 5n + 10)\mu_3\mu_2]$$

$$\mu'_2(m_3) = \frac{1}{n^5}[(n-1)^2(n-2)^2\mu_6 + (-6n + 15)(n-1)(n-2)^2\mu_4\mu_2 + (n^2 - 2n + 10)(n-1)(n-2)^2\mu_3^2 + (9n^2 - 36n + 60)(n-1)(n-2)\mu_2^3]$$

etc.
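The special case $\mu'_2(m_2)$, that is $E(m_2^2)$, lends itself to an exact check by enumeration. The sketch below assumes an illustrative three-point universe and independent draws; all names are hypothetical. Because $m_2$ does not depend on the origin, the universe moments entering the formula are taken about the universe mean.

```python
from fractions import Fraction
from itertools import product

# illustrative universe (assumption): values and probabilities
values = [0, 1, 3]
probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]

def universe_central_moment(r):
    # mu_r: universe moment about the universe mean
    mean = sum(p * v for v, p in zip(values, probs))
    return sum(p * (Fraction(v) - mean) ** r for v, p in zip(values, probs))

def expected_m2_squared(n):
    # exact E(m_2^2) obtained by enumerating every sample of size n
    total = Fraction(0)
    for idx in product(range(len(values)), repeat=n):
        weight = Fraction(1)
        for i in idx:
            weight *= probs[i]
        xs = [Fraction(values[i]) for i in idx]
        mean = sum(xs) / n
        m2 = sum((x - mean) ** 2 for x in xs) / n
        total += weight * m2**2
    return total

def formula_m2_squared(n):
    # [(n-1)^2 mu_4 + (n^2 - 2n + 3)(n-1) mu_2^2] / n^3
    mu2 = universe_central_moment(2)
    mu4 = universe_central_moment(4)
    return ((n - 1) ** 2 * mu4 + (n * n - 2 * n + 3) * (n - 1) * mu2**2) / n**3
```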


In the same way

$$\mu'_2(k_2) = \frac{\mu_4}{n} + \frac{(n^2 - 2n + 3)\mu_2^2}{n(n-1)}$$

$$\mu'_{11}(k_3, k_2) = \frac{\mu_5}{n} + \frac{(n^2 - 5n + 10)\mu_3\mu_2}{n(n-1)}$$

$$\mu'_2(k_3) = \frac{\mu_6}{n} + \frac{(-6n + 15)\mu_4\mu_2}{n(n-1)} + \frac{(n^2 - 2n + 10)\mu_3^2}{n(n-1)} + \frac{(9n^2 - 36n + 60)\mu_2^3}{n(n-1)(n-2)}$$

etc.

and

$$\mu'_2(m'_2) = \frac{1}{n}[\mu'_4 + (n-1)\mu_2'^2]$$

$$\mu'_{11}(m'_3, m'_2) = \frac{1}{n}[\mu'_5 + (n-1)\mu'_3\mu'_2]$$

$$\mu'_2(m'_3) = \frac{1}{n}[\mu'_6 + (n-1)\mu_3'^2]$$

etc.


12. The Expected Value of the Products of f Functions in Terms of the
Thiele Moments of the Universe. The formulae giving the $\mu$'s in terms of the
$\lambda$'s are

$$\mu_2 = \lambda_2$$

$$\mu_3 = \lambda_3$$

$$\mu_4 = \lambda_4 + 3\lambda_2^2$$

$$\mu_5 = \lambda_5 + 10\lambda_3\lambda_2$$

$$\mu_6 = \lambda_6 + 15\lambda_4\lambda_2 + 10\lambda_3^2 + 15\lambda_2^3$$

$$\mu_r = \sum \binom{r}{p_1^{\pi_1} \cdots p_s^{\pi_s}} (\lambda_{p_1})^{\pi_1} (\lambda_{p_2})^{\pi_2} \cdots (\lambda_{p_s})^{\pi_s}$$

where the summation holds for those partitions having no unit parts. See
the results of Craig (2, 7-11) and Frisch (12, 21). It is at once possible to
express the moment formulae in terms of the Thiele moments of the universe.
Thus the general results above become

$$\mu'_2(f_2) = b_2^2n\lambda_4 + [3b_2^2n + (b_2^2 + 2b_{11}^2)n(n-1)]\lambda_2^2$$

$$\mu'_{11}(f_3, f_2) = b_3b_2n\lambda_5 + [10b_3b_2n + (b_3b_2 + 3b_{21}b_2 + 6b_{21}b_{11})n(n-1)]\lambda_3\lambda_2$$

$$\mu'_2(f_3) = b_3^2n\lambda_6 + [15b_3^2n + (9b_{21}^2 + 6b_3b_{21})n(n-1)]\lambda_4\lambda_2 + [10b_3^2n + (b_3^2 + 9b_{21}^2)n(n-1)]\lambda_3^2$$

$$+ [15b_3^2n + (27b_{21}^2 + 18b_3b_{21})n(n-1) + (9b_{21}^2 + 6b_{111}^2)n(n-1)(n-2)]\lambda_2^3.$$
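The relation between the $\mu$'s and the $\lambda$'s used in this section can be sketched as a sum over partitions with no unit parts. The helper names below are hypothetical; the two checks use universes whose half invariants are standard facts, namely a normal universe ($\lambda_2 = \sigma^2$, all higher $\lambda$'s zero) and a Poisson universe (every $\lambda_r$ equal to the Poisson parameter).

```python
from collections import Counter
from math import factorial

def partitions_without_units(r, largest=None):
    # yield the partitions of r, in non-increasing order, having no part equal to 1
    largest = r if largest is None else largest
    if r == 0:
        yield ()
        return
    for p in range(min(r, largest), 1, -1):
        for rest in partitions_without_units(r - p, p):
            yield (p,) + rest

def partition_coefficient(parts):
    # number of ways the units can be collected to form the partition
    denom = 1
    for p, multiplicity in Counter(parts).items():
        denom *= factorial(p) ** multiplicity * factorial(multiplicity)
    return factorial(sum(parts)) // denom

def moment_from_half_invariants(r, lam):
    # mu_r as a sum over partitions with no unit parts; lam maps order -> lambda
    total = 0
    for parts in partitions_without_units(r):
        term = partition_coefficient(parts)
        for p in parts:
            term *= lam[p]
        total += term
    return total

sigma2 = 3
normal = {2: sigma2, 3: 0, 4: 0, 5: 0, 6: 0}   # normal universe
poisson = {r: 2 for r in range(2, 7)}          # Poisson universe, parameter 2
```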


13. The Thiele Moments of the f's in terms of Thiele Moments. It is
now possible to reduce to the Thiele moments of the $f$'s by means of the usual
relations

$$\lambda_2(f_r) = \mu'_2(f_r) - \mu_1'^2(f_r)$$

$$\lambda_{11}(f_{r_1}, f_{r_2}) = \mu'_{11}(f_{r_1}, f_{r_2}) - \mu'_{10}(f_{r_1}, f_{r_2})\mu'_{01}(f_{r_1}, f_{r_2})$$

$$\lambda_3(f_r) = \mu'_3(f_r) - 3\mu'_2(f_r)\mu'_1(f_r) + 2\mu_1'^3(f_r)$$

etc.

so that the results become

$$\lambda_2(f_2) = b_2^2n\lambda_4 + 2[b_2^2n + b_{11}^2n(n-1)]\lambda_2^2$$

$$\lambda_{11}(f_3, f_2) = b_3b_2n\lambda_5 + \{3[b_3b_2n + b_{21}b_2n(n-1)] + 6[b_3b_2n + b_{21}b_{11}n(n-1)]\}\lambda_3\lambda_2$$

$$\lambda_2(f_3) = b_3^2n\lambda_6 + \{6[b_3^2n + b_3b_{21}n(n-1)] + 9[b_3^2n + b_{21}^2n(n-1)]\}\lambda_4\lambda_2 + 9[b_3^2n + b_{21}^2n(n-1)]\lambda_3^2$$

$$+ \{9[b_3^2n + 2b_3b_{21}n(n-1) + b_{21}^2n(n-1) + b_{21}^2n(n-1)(n-2)] + 6[b_3^2n + 3b_{21}^2n(n-1) + b_{111}^2n(n-1)(n-2)]\}\lambda_2^3$$

etc.


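The formulae as written are adapted to the partition representation of Part III.
When the $f$'s are equal to the $m$'s we have

$$\lambda_2(m_2) = \frac{(n-1)^2\lambda_4}{n^3} + \frac{2(n-1)\lambda_2^2}{n^2}$$

$$\lambda_{11}(m_3, m_2) = \frac{(n-1)^2(n-2)\lambda_5}{n^4} + \frac{6(n-1)(n-2)\lambda_3\lambda_2}{n^3}$$

$$\lambda_2(m_3) = \frac{(n-1)^2(n-2)^2\lambda_6}{n^5} + \frac{9(n-1)(n-2)^2\lambda_4\lambda_2}{n^4} + \frac{9(n-1)(n-2)^2\lambda_3^2}{n^4} + \frac{6(n-1)(n-2)\lambda_2^3}{n^3}$$

etc.

which are the results as previously given by C. C. Craig (2, 55). In like manner
when the $f_r = k_r$

$$\lambda_2(k_2) = \frac{\lambda_4}{n} + \frac{2\lambda_2^2}{n-1}$$

$$\lambda_{11}(k_3, k_2) = \frac{\lambda_5}{n} + \frac{6\lambda_3\lambda_2}{n-1}$$

$$\lambda_2(k_3) = \frac{\lambda_6}{n} + \frac{9\lambda_4\lambda_2}{n-1} + \frac{9\lambda_3^2}{n-1} + \frac{6n\lambda_2^3}{(n-1)(n-2)}$$

etc.

as given by R. A. Fisher [3, 210] while

$$\lambda_2(m'_2) = \frac{1}{n}(\lambda_4 + 2\lambda_2^2)$$

$$\lambda_{11}(m'_3, m'_2) = \frac{1}{n}(\lambda_5 + 9\lambda_3\lambda_2)$$

$$\lambda_2(m'_3) = \frac{1}{n}(\lambda_6 + 15\lambda_4\lambda_2 + 9\lambda_3^2 + 15\lambda_2^3)$$

etc.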
14. Various Formulization of Results. Although different moment functions
of the universe may be used it is customary to express the results in terms of
universe moments about a fixed point, in terms of universe moments, or in
terms of universe Thiele moments. It is possible to express results in any of
the nine forms

$$\left.\begin{matrix} \mu'(f_r) \\ \mu(f_r) \\ \lambda(f_r) \end{matrix}\right\} \text{ in terms of } \left\{\begin{matrix} \text{moments about a fixed point } (\mu') \\ \text{moments } (\mu) \\ \text{Thiele moments } (\lambda) \end{matrix}\right.$$

where $f_r$ represents the isobaric sample moment function of weight $r$. One
purpose of such varied formulization is to discover the most compact form
and also the one best adapted to use in the case of a normal universe or a universe
whose moments obey some discoverable law. As suggested above Craig
(2) has shown the relative compactness obtained by using $\lambda(m_r)$ and Thiele
moments of the universe while R. A. Fisher (3) has shown the great additional
compactness obtained by taking $f_r = k_r$.


15. The Application of the Algebraic Method to $\lambda_{21}(f_3, f_2)$. Before leaving
the algebraic method it is perhaps wise to outline the steps in the case of a
more involved problem. We take the example which R. A. Fisher (3, 207)
has used in the case in which $f_r = k_r$. To find $\lambda_{21}(f_3, f_2)$.

The value of $f_3^2f_2$ was found in section 8. To find its expected value it is
only necessary to enter the coefficients of the different partition products in
this expansion at the left of the corresponding rows as indicated in Table II.

The coefficient of any moment partition of the universe is found by multiplying
each column entry by its corresponding left row entry and then by
multiplying by $n^{(\rho)}$ as indicated at the top. Thus the coefficient of $\mu_8$ is

$$\{a_3^2a_2 + a_3^2a_{11} + 6a_3a_{21}a_2 + 6a_3a_{21}a_{11} + 2a_3a_{111}a_2 + 9a_{21}^2a_2 + 2a_3a_{111}a_{11} + 6a_2a_{21}a_{111}$$

$$+ 9a_{21}^2a_{11} + 6a_{21}a_{111}a_{11} + a_{111}^2a_2 + a_{111}^2a_{11}\}n$$

which after some algebraic work reduces to

$$(a_3 + 3a_{21} + a_{111})^2(a_2 + a_{11})n = b_3^2b_2n.$$

In this manner it is possible to write the result either in terms of universe
moments about a fixed point or in terms of universe moments. If moments
are used, one may neglect all column partitions involving unity.

It should be noted that the $a$'s defining $k_r$ as given in Table I can be inserted
here if desired. If these multipliers are introduced throughout the rows and
columnar partitions involving unit parts are not used one will arrive at Table I
of R. A. Fisher [3, 208] though there are some slight typographical errors in
his rows for $(3)^2(1)^2$ and $(3)(2^2)(1)$.

Determining all the coefficients in this manner we find after considerable
algebraic manipulation that
$$\mu'_{21}(f_3, f_2) = b_3^2b_2n\mu_8 + [b_3^2b_2 + 9b_{21}^2b_2 + 12b_3b_{21}b_{11} + 6b_3b_{21}b_2]n(n-1)\mu_6\mu_2$$

$$+ [2b_3^2b_2 + 18b_{21}^2b_2 + 18b_{21}^2b_{11} + 6b_3b_{21}b_2 + 12b_3b_{21}b_{11}]n(n-1)\mu_5\mu_3$$

$$+ [2b_3^2b_{11} + 9b_{21}^2b_2 + 18b_{21}^2b_{11} + 6b_3b_{21}b_2]n(n-1)\mu_4^2$$

$$+ [36b_{21}^2b_2 + 54b_{21}^2b_{11} + 6b_3b_{21}b_2 + 12b_3b_{21}b_{11} + 12b_3b_{111}b_{11} + 72b_{21}b_{111}b_{11} + 18b_{111}^2b_2]n(n-1)(n-2)\mu_4\mu_2^2$$

$$+ [b_3^2b_2 + 6b_3b_{21}b_2 + 12b_3b_{21}b_{11} + 27b_{21}^2b_2 + 90b_{21}^2b_{11} + 36b_{21}b_{111}b_2 + 72b_{21}b_{111}b_{11} + 36b_{111}^2b_{11}]n(n-1)(n-2)\mu_3^2\mu_2$$

$$+ [9b_{21}^2b_2 + 18b_{21}^2b_{11} + 36b_{21}b_{111}b_{11} + 6b_{111}^2b_2 + 36b_{111}^2b_{11}]n(n-1)(n-2)(n-3)\mu_2^4.$$

If $f_r = k_r$ the proper values of $b$ are inserted and the expression above becomes
that given by R. A. Fisher (3, 208). For example the coefficient of $\mu_2^4$ is

$$\frac{(9n^3 - 63n^2 + 240n - 420)(n-3)}{n^2(n-1)^2(n-2)}$$

when

$$b_2 = \frac{1}{n}, \qquad b_{11} = -\frac{1}{n(n-1)}, \qquad b_{21} = -\frac{1}{n(n-1)}, \qquad b_{111} = \frac{2}{n(n-1)(n-2)}.$$

The algebraic results involved in changing the general formula above to
other functions are too extended to present here. A symbolic means of attaining
them is included in later sections of the paper.
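The stated coefficient of $\mu_2^4$ for $f_r = k_r$ can be reproduced by direct substitution of the $b$'s into the last bracket of the general formula. The sketch below does this with exact fractions for several sample sizes; the function names are illustrative assumptions.

```python
from fractions import Fraction

def falling(n, j):
    # falling factorial n^(j) = n (n-1) ... (n-j+1)
    out = 1
    for i in range(j):
        out *= n - i
    return out

def mu2_fourth_coefficient(n):
    # last bracket of the general formula, with the b's of the k functions inserted
    b2 = Fraction(1, n)
    b11 = Fraction(-1, n * (n - 1))
    b21 = Fraction(-1, n * (n - 1))
    b111 = Fraction(2, n * (n - 1) * (n - 2))
    bracket = (9 * b21**2 * b2 + 18 * b21**2 * b11 + 36 * b21 * b111 * b11
               + 6 * b111**2 * b2 + 36 * b111**2 * b11)
    return bracket * falling(n, 4)

def closed_form(n):
    # (9n^3 - 63n^2 + 240n - 420)(n - 3) / [n^2 (n-1)^2 (n-2)]
    return Fraction((9 * n**3 - 63 * n**2 + 240 * n - 420) * (n - 3),
                    n**2 * (n - 1) ** 2 * (n - 2))
```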


Part II. The Determination of Specific f Functions

16. Functions Determined by the b's. In Part I it was shown how various $f$
functions are defined by giving definite values to the coefficients of the power
sums. It is the purpose of this part of the paper to show how functions can
be specified by means of their expected values in terms of moments of the
universe. This is essentially the method used by R. A. Fisher in defining his
$k$ function and it is here extended to other functions. In this case the $b$'s are
first determined and the $a$'s are then found from them. The first moments
of $f_1$, $f_2$, $f_3$ were given in section 10. To these we add, as shown by Table II

$$\mu'_1(f_4) = (a_4 + 4a_{31} + 3a_{22} + 6a_{211} + a_{1111})n\mu'_4 + 4(a_{31} + 3a_{211} + a_{1111})n(n-1)\mu'_3\mu'_1$$

$$+ 3(a_{22} + 2a_{211} + a_{1111})n(n-1)\mu_2'^2 + 6(a_{211} + a_{1111})n(n-1)(n-2)\mu'_2\mu_1'^2$$

$$+ a_{1111}n(n-1)(n-2)(n-3)\mu_1'^4$$

etc.

These can be written more symbolically in terms of the $b$'s

$$\mu'_1(f_1) = b_1n\mu'_1$$

$$\mu'_1(f_2) = b_2n\mu'_2 + b_{11}n(n-1)\mu_1'^2$$

$$\mu'_1(f_3) = b_3n\mu'_3 + 3b_{21}n(n-1)\mu'_2\mu'_1 + b_{111}n(n-1)(n-2)\mu_1'^3$$

$$\mu'_1(f_4) = b_4n\mu'_4 + 4b_{31}n(n-1)\mu'_3\mu'_1 + 3b_{22}n(n-1)\mu_2'^2 + 6b_{211}n^{(3)}\mu'_2\mu_1'^2 + b_{1111}n^{(4)}\mu_1'^4,$$

and in general

$$\mu'_1(f_r) = \sum \binom{r}{p_1^{\pi_1} p_2^{\pi_2} \cdots p_s^{\pi_s}} b_{p_1^{\pi_1} \cdots p_s^{\pi_s}}\, n^{(\rho)} (\mu'_{p_1})^{\pi_1} (\mu'_{p_2})^{\pi_2} \cdots (\mu'_{p_s})^{\pi_s}.$$

The expansion of the function in terms of the power sums of the sample demands
the determination of the $a$'s. This can be accomplished by solving the equations

$$a_1 = b_1$$

$$a_2 + a_{11} = b_2$$
$$a_{11} = b_{11}$$

$$a_3 + 3a_{21} + a_{111} = b_3$$
$$a_{21} + a_{111} = b_{21}$$
$$a_{111} = b_{111}$$

$$a_4 + 4a_{31} + 3a_{22} + 6a_{211} + a_{1111} = b_4$$
$$a_{31} + 3a_{211} + a_{1111} = b_{31}$$
$$a_{22} + 2a_{211} + a_{1111} = b_{22}$$

etc.

The solutions are

$$a_1 = b_1$$

$$a_2 = b_2 - b_{11}$$
$$a_{11} = b_{11}$$

$$a_3 = b_3 - 3b_{21} + 2b_{111}$$
$$a_{21} = b_{21} - b_{111}$$
$$a_{111} = b_{111}$$

$$a_4 = b_4 - 4b_{31} - 3b_{22} + 12b_{211} - 6b_{1111}$$
$$a_{31} = b_{31} - 3b_{211} + 2b_{1111}$$
$$a_{22} = b_{22} - 2b_{211} + b_{1111}$$
$$a_{211} = b_{211} - b_{1111}$$
$$a_{1111} = b_{1111}.$$
The values of $a_r$, at least for $r \le 4$, follow the law

$$a_r = \sum \binom{r}{p_1^{\pi_1} \cdots p_s^{\pi_s}} (-1)^{\rho-1}(\rho-1)!\; b_{p_1^{\pi_1} \cdots p_s^{\pi_s}}$$

and

$a_{21} = \{a_2 a_1\}$, where $\{a_2 a_1\}$ indicates that $a_2 = b_2 - b_{11}$ is multiplied by $a_1 = b_1$,
the rule of multiplication being suffixing of subscripts. Similarly
$a_{22} = \{a_2 a_2\} = \{(b_2 - b_{11})(b_2 - b_{11})\} = b_{22} - 2b_{211} + b_{1111}$.

This statement illustrates a general theorem, which will be established later
in another paper by a different approach, that for all cases

$$a_r = \sum \binom{r}{p_1^{\pi_1} \cdots p_s^{\pi_s}} (-1)^{\rho-1}(\rho-1)!\; b_{p_1^{\pi_1} \cdots p_s^{\pi_s}}$$

and that

$$a_{r_1 r_2 \cdots r_t} = \{a_{r_1} a_{r_2} \cdots a_{r_t}\}.$$

This theorem enables one to write, with comparative ease, the coefficient of
any product of power sums in a sample function whose expected value is defined.
For example the functional coefficient of (3)(2) in $f_5$ is

$$\{a_3 a_2\} = \{(b_3 - 3b_{21} + 2b_{111})(b_2 - b_{11})\} = b_{32} - b_{311} - 3b_{221} + 5b_{2111} - 2b_{11111}$$

while that of (3)(1)(1) is $\{a_3 a_1 a_1\} = b_{311} - 3b_{2111} + 2b_{11111}$. If the expected value
of the function is known the b's are determined and the values of the above
expressions can be found by substitution.
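The law for $a_r$ and the rule of multiplication by suffixing of subscripts can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `partitions`, `a_single` and `a_product` are hypothetical helpers introduced here, and each result is stored as a dictionary mapping the subscript of a b (a partition written as a tuple) to its integer coefficient.

```python
from collections import Counter
from math import factorial


def partitions(r, largest=None):
    """Yield the partitions of r as non-increasing tuples."""
    if largest is None:
        largest = r
    if r == 0:
        yield ()
        return
    for first in range(min(r, largest), 0, -1):
        for rest in partitions(r - first, first):
            yield (first,) + rest


def a_single(r):
    """a_r as {subscript of b: coefficient}, following the law of section 16."""
    out = {}
    for p in partitions(r):
        denom = 1
        for part in p:
            denom *= factorial(part)          # (p_i!)^{pi_i}
        for mult in Counter(p).values():
            denom *= factorial(mult)          # pi_i!
        rho = len(p)                          # number of parts
        out[p] = factorial(r) // denom * (-1) ** (rho - 1) * factorial(rho - 1)
    return out


def a_product(*rs):
    """a_{r1 r2 ...} = {a_r1 a_r2 ...}: multiply by suffixing subscripts."""
    result = {(): 1}
    for r in rs:
        nxt = {}
        for p, c in result.items():
            for q, d in a_single(r).items():
                key = tuple(sorted(p + q, reverse=True))
                nxt[key] = nxt.get(key, 0) + c * d
        result = nxt
    return result
```

For instance `a_product(3, 2)` returns the five coefficients 1, -1, -3, 5, -2 attached to the subscripts 32, 311, 221, 2111, 11111, in agreement with the expression for $\{a_3 a_2\}$ above.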


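17. The Values of the Fisher Moments (k functions). The k functions have
been defined to be those functions whose expected values are the Thiele moments
of the universe. Thus $\mu'_1(k_r) = \lambda_r$ and since

$$\lambda_r = \sum \binom{r}{p_1^{\pi_1} \cdots p_s^{\pi_s}} (-1)^{\rho-1}(\rho-1)!\,(\mu'_{p_1})^{\pi_1} \cdots (\mu'_{p_s})^{\pi_s}$$

it follows at once, by comparison with $\mu'_1(f_r)$ in the last section, that

$$b_{p_1^{\pi_1} \cdots p_s^{\pi_s}} = \frac{(-1)^{\rho-1}(\rho-1)!}{n^{(\rho)}}.$$

Thus

$$b_1 = \frac{1}{n}; \qquad b_2 = \frac{1}{n},\quad b_{11} = -\frac{1}{n^{(2)}};$$

$$b_3 = \frac{1}{n},\quad b_{21} = -\frac{1}{n^{(2)}},\quad b_{111} = \frac{2}{n^{(3)}};$$

$$b_4 = \frac{1}{n},\quad b_{31} = -\frac{1}{n^{(2)}},\quad b_{22} = -\frac{1}{n^{(2)}},\quad b_{211} = \frac{2}{n^{(3)}},\quad b_{1111} = -\frac{6}{n^{(4)}},\ \text{etc.}$$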
The insertion of these values in the formulae of section 16 gives the values of
a such as those indicated in Table I and in section 5. Thus the coefficient of
(3)(2) in $k_5$ is

$$10(b_{32} - b_{311} - 3b_{221} + 5b_{2111} - 2b_{11111}) = -10\left[\frac{1}{n^{(2)}} + \frac{8}{n^{(3)}} + \frac{30}{n^{(4)}} + \frac{48}{n^{(5)}}\right] = \frac{-10\,n(n-1)}{(n-1)^{(4)}}.$$

The coefficient of (3)(1)(1) is

$$10(b_{311} - 3b_{2111} + 2b_{11111}) = 10\left[\frac{2}{n^{(3)}} + \frac{18}{n^{(4)}} + \frac{48}{n^{(5)}}\right] = \frac{20(n+2)}{(n-1)^{(4)}}.$$
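The two coefficients of $k_5$ just quoted can be checked with exact rational arithmetic. The following is a minimal sketch, assuming only the values $b = (-1)^{\rho-1}(\rho-1)!/n^{(\rho)}$ of this section; the helper names `falling`, `b_k`, `coeff_32` and `coeff_311` are introduced here for illustration.

```python
from fractions import Fraction
from math import factorial


def falling(n, k):
    """Falling factorial n^(k) = n(n-1)...(n-k+1)."""
    out = 1
    for i in range(k):
        out *= n - i
    return out


def b_k(rho, n):
    """b for the k functions; it depends only on the number of parts rho."""
    return Fraction((-1) ** (rho - 1) * factorial(rho - 1), falling(n, rho))


def coeff_32(n):
    """10 (b_32 - b_311 - 3 b_221 + 5 b_2111 - 2 b_11111)."""
    return 10 * (b_k(2, n) - b_k(3, n) - 3 * b_k(3, n) + 5 * b_k(4, n) - 2 * b_k(5, n))


def coeff_311(n):
    """10 (b_311 - 3 b_2111 + 2 b_11111)."""
    return 10 * (b_k(3, n) - 3 * b_k(4, n) + 2 * b_k(5, n))
```

For sample sizes $n \ge 5$ both functions reproduce the closed forms $-10n(n-1)/(n-1)^{(4)}$ and $20(n+2)/(n-1)^{(4)}$.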


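18. The h Functions. It is also possible to define a function whose expected
value is the moment of the universe. Thus $\mu'_1(h_r) = \mu_r$ where

$$\mu_r = \sum \binom{r}{p_1^{\pi_1} \cdots p_s^{\pi_s}} A_{p_1^{\pi_1} \cdots p_s^{\pi_s}}\,(\mu'_{p_1})^{\pi_1} \cdots (\mu'_{p_s})^{\pi_s}$$

and

$$A_{p_1^{\pi_1} \cdots p_s^{\pi_s}} = \begin{cases}
1 & \text{if } s = 1,\ \pi_1 = 1, \text{ and } p_1 = r, \\
(-1)^{\pi_2} & \text{if } p_1 > 1,\ \pi_1 = 1,\ s = 2, \text{ and } p_2 = 1, \\
(-1)^{r-1}(r-1) & \text{if } p_1 = 1,\ s = 1, \text{ and } \pi_1 = r, \\
0 & \text{in all other cases.}
\end{cases}$$

Comparing with the value of $\mu'_1(f_r)$ in section 16 we have

$$b_{p_1^{\pi_1} \cdots p_s^{\pi_s}} = \frac{A_{p_1^{\pi_1} \cdots p_s^{\pi_s}}}{n^{(\rho)}}.$$

The substitution of these values of b in the results of section 16 gives the
expansions of $h_r$ in terms of power sums as illustrated by the formulae of section 6
and Table I. Thus the coefficient of (3)(2) in $h_5$ is

$$10(b_{32} - b_{311} - 3b_{221} + 5b_{2111} - 2b_{11111}) = 10\left[0 - \frac{1}{n^{(3)}} + 0 - \frac{5}{n^{(4)}} - \frac{8}{n^{(5)}}\right] = \frac{-10(n-2)}{(n-1)^{(4)}}.$$

Similarly the coefficient of (3)(1)(1) in $h_5$ is

$$10(b_{311} - 3b_{2111} + 2b_{11111}) = 10\left[\frac{1}{n^{(3)}} + \frac{3}{n^{(4)}} + \frac{8}{n^{(5)}}\right] = \frac{10(n^2 - 4n + 8)}{n^{(5)}}.$$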
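The same kind of check applies to the h functions. The sketch below encodes the four cases of $A$ for a partition of $r \ge 2$ (the case $r = 1$ is deliberately left out) and recomputes the two coefficients of $h_5$; `A`, `b_h` and the two `coeff_*` functions are illustrative names, not notation from the paper.

```python
from fractions import Fraction


def falling(n, k):
    """Falling factorial n^(k) = n(n-1)...(n-k+1)."""
    out = 1
    for i in range(k):
        out *= n - i
    return out


def A(p):
    """A for a partition p of r >= 2, written as a non-increasing tuple."""
    r = sum(p)
    if len(p) == 1:                                  # the single part r
        return 1
    if all(part == 1 for part in p):                 # r unit parts
        return (-1) ** (r - 1) * (r - 1)
    if p[0] > 1 and all(part == 1 for part in p[1:]):
        return (-1) ** (len(p) - 1)                  # one part > 1, rest units
    return 0


def b_h(p, n):
    """b for the h functions: A divided by n^(rho)."""
    return Fraction(A(p), falling(n, len(p)))


def coeff_32(n):
    return 10 * (b_h((3, 2), n) - b_h((3, 1, 1), n) - 3 * b_h((2, 2, 1), n)
                 + 5 * b_h((2, 1, 1, 1), n) - 2 * b_h((1, 1, 1, 1, 1), n))


def coeff_311(n):
    return 10 * (b_h((3, 1, 1), n) - 3 * b_h((2, 1, 1, 1), n)
                 + 2 * b_h((1, 1, 1, 1, 1), n))
```

With $n \ge 5$ these agree with $-10(n-2)/(n-1)^{(4)}$ and $10(n^2-4n+8)/n^{(5)}$ respectively.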


19. The h' Functions. One line of attack calls for the introduction of new 
moment functions which will result in simpler formulae. Thus for example, 
C. C. Craig wrote (2, 37) “It rather seems that the best hopes of effectively
further simplifying the problem of sampling for statistical characteristics lie 
either in the discovery of a new kind of symmetric function of all the observa¬ 
tions which may be used to characterize frequency functions and which will 
be more amenable than either moments or semi-invariants for use in sampling 
problems, or in, what may very well prove to be much better and more 
feasible, the abandonment of the method of characterizing frequency functions 
by symmetric functions of all the observations altogether.” 

R. A. Fisher has shown that it is possible to introduce symmetric functions
which do simplify the resulting formulae appreciably. It is the purpose of this
section to introduce an additional symmetric function which simplifies the
resulting formulae to a much greater extent. It is admitted that this function
does not have all the properties (such as invariance with respect to change of
origin) possessed by the Thiele and Fisher functions, but it does have the
property of making the resulting formulae simple. It also has the advantage
that $\mu(h'_r) = \mu'(h'_r)$.

The basic idea is to find a sample moment function whose expected value is 0.
A first attempt, placing every b = 0, is of no avail since every a is also equal
to 0 and there is no function. A second attempt is based on the idea of finding
the function $h'_r$ whose expected value is $(\mu'_1)^r$. If the universe is assumed to be
measured about its mean, as is conventional, it follows at once that $\mu'_1 = 0$
and $\mu'_1(h'_r) = 0$ so that

$$\mu_{11}(h'_{r_1}, h'_{r_2}) = \mu'_{11}(h'_{r_1}, h'_{r_2}).$$


This function then has the property that its moments about a fixed point and 
its moments are identical. 

In order to discover its expansion in terms of power sums, we note

$$\mu'_1(h'_r) = (\mu'_1)^r$$

and it follows at once by comparison with $\mu'_1(f_r)$ in section 16 that $b_{1^r} = \dfrac{1}{n^{(r)}}$
and $b_{p_1^{\pi_1} \cdots p_s^{\pi_s}} = 0$ in all other cases. The a's are determined in the usual
way. Thus

$$a_2 = b_2 - b_{11} = -\frac{1}{n(n-1)}, \qquad a_{11} = b_{11} = \frac{1}{n(n-1)}$$

so that

$$h'_2 = -\frac{1}{n(n-1)}\,[(2) - (1)(1)],$$

$$h'_3 = \frac{1}{n^{(3)}}\,[2(3) - 3(2)(1) + (1)^3],$$

$$h'_4 = -\frac{1}{n^{(4)}}\,[6(4) - 8(3)(1) - 3(2)(2) + 6(2)(1)(1) - (1)^4],$$

and in general

$$a_{p_1^{\pi_1} \cdots p_s^{\pi_s}} = \frac{(-1)^{r-\rho}\,[(p_1 - 1)!]^{\pi_1}\,[(p_2 - 1)!]^{\pi_2} \cdots [(p_s - 1)!]^{\pi_s}}{n^{(r)}}.$$

In order to show the simple form in which results can be given we substitute
the values of the b's in the results obtained above. Not only does $\mu'_1(h'_r) = 0$,
but by section 11

$$\lambda_2(h'_2) = \mu_2(h'_2) = \mu'_2(h'_2) = \frac{2\mu_2^2}{n(n-1)}$$

$$\lambda_{11}(h'_3, h'_2) = \mu_{11}(h'_3, h'_2) = \mu'_{11}(h'_3, h'_2) = 0$$

$$\lambda_2(h'_3) = \mu_2(h'_3) = \mu'_2(h'_3) = \frac{6\mu_2^3}{n(n-1)(n-2)}$$

while from section 15

$$\lambda_{21}(h'_3, h'_2) = \mu_{21}(h'_3, h'_2) = \mu'_{21}(h'_3, h'_2) = \frac{36\,\mu_3^2\mu_2}{n^2(n-1)^2(n-2)} + \frac{36(n-3)\,\mu_2^4}{n^2(n-1)^2(n-2)}.$$


It is to be noticed that these formulae contain very few terms and that the 
terms themselves involve very low moments of the universe. This simplicity 
has been attained without making any assumption such as normality, regarding 
the nature of the universe. 
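As a hedged numerical illustration of this simplicity, the moments of $h'_2$ and $h'_3$ can be computed exactly by enumerating every ordered sample of size n from a small discrete universe with mean zero. The universe in the usage note, and the names `falling` and `exact_expectations`, are assumptions made for this sketch only; sampling is taken to be independent (an infinite universe).

```python
from fractions import Fraction
from itertools import product


def falling(n, k):
    """Falling factorial n^(k) = n(n-1)...(n-k+1)."""
    out = 1
    for i in range(k):
        out *= n - i
    return out


def exact_expectations(values, probs, n):
    """Return E[h2'], E[h2'^2], E[h3'^2], E[h3'^2 h2'] over all ordered samples."""
    e_h2 = e_h2sq = e_h3sq = e_h3sq_h2 = Fraction(0)
    for idx in product(range(len(values)), repeat=n):
        weight = Fraction(1)
        for i in idx:
            weight *= probs[i]
        xs = [values[i] for i in idx]
        s1 = sum(xs)                      # power sum (1)
        s2 = sum(x ** 2 for x in xs)      # power sum (2)
        s3 = sum(x ** 3 for x in xs)      # power sum (3)
        h2 = Fraction(-(s2 - s1 * s1), falling(n, 2))
        h3 = Fraction(2 * s3 - 3 * s2 * s1 + s1 ** 3, falling(n, 3))
        e_h2 += weight * h2
        e_h2sq += weight * h2 ** 2
        e_h3sq += weight * h3 ** 2
        e_h3sq_h2 += weight * h3 ** 2 * h2
    return e_h2, e_h2sq, e_h3sq, e_h3sq_h2
```

Taking, say, the values -1 and 2 with probabilities 2/3 and 1/3 (so that the mean is 0 and $\mu_2 = \mu_3 = 2$) and n = 4, the enumeration can be compared term by term with the closed forms of this section.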


20. Table of Values of b for Different Functions When r < 6. This process 
of defining functions by means of expected values could be extended indefinitely. 
Perhaps it has been applied to enough functions to suggest the breadth of the 
applicability of the theory developed in Part I and Part III. 

As the b's are the quantities which are used in the formulae I have provided
Table III giving their values for the six functions $m'_r$, $m_r$, $l_r$, $k_r$, $h_r$, $h'_r$ when
r = 1, 2, 3, 4, 5. When the a's are known, the b's are computed from them
according to the formulae of section 16.

TABLE III

Values of the b's for r ≤ 5
Part III. Combinatory Methods

21. Partition Representation of Expected Value of f Functions. The formulae

$$\begin{aligned}
\mu'_1(f_1) &= b_1 n\mu'_1 \\
\mu'_1(f_2) &= b_2 n\mu'_2 + b_{11}\,n(n-1)(\mu'_1)^2 \\
\mu'_1(f_3) &= b_3 n\mu'_3 + 3b_{21}\,n(n-1)\mu'_2\mu'_1 + b_{111}\,n(n-1)(n-2)(\mu'_1)^3 \\
\mu'_1(f_4) &= b_4 n\mu'_4 + 4b_{31}\,n(n-1)\mu'_3\mu'_1 + 3b_{22}\,n(n-1)(\mu'_2)^2 \\
&\quad + 6b_{211}\,n(n-1)(n-2)\mu'_2(\mu'_1)^2 + b_{1111}\,n^{(4)}(\mu'_1)^4
\end{aligned}$$

are “synthetically” given by the column partitions

1

2   1
    1

3   2   1
    1   1
        1

4   3   2   2   1
    1   2   1   1
            1   1
                1

The partition parts represent both the subscripts of the moments and the
subscripts of the b's. If $\rho$ indicates the number of parts, the n multiplier
is $n^{(\rho)}$. The numerical coefficient is obtained by taking the factorial of the sum
of the entries in the column (the weight) and dividing it by the factorials of all
entries times the factorials of the numbers of repeated entries, as indicated by

$$\binom{r}{p_1^{\pi_1} \cdots p_s^{\pi_s}} = \frac{r!}{(p_1!)^{\pi_1}(p_2!)^{\pi_2} \cdots (p_s!)^{\pi_s}\;\pi_1!\,\pi_2! \cdots \pi_s!}.$$

The translation from the synthetic partition form to the expanded form is
accelerated if the coefficients are known. These are provided in the following
partition representation of the formula for $\mu'_1(f_r)$ when $r \le 8$ and the results
are expressed in terms of the moments of the universe

$\mu'_1(f_1)$:   0

$\mu'_1(f_2)$:   1
             2

$\mu'_1(f_3)$:   1
             3

$\mu'_1(f_4)$:   1     3
             4     2
                   2

$\mu'_1(f_5)$:   1     10
             5     3
                   2

$\mu'_1(f_6)$:   1     15    10    15
             6     4     3     2
                   2     3     2
                               2

$\mu'_1(f_7)$:   1     21    35    105
             7     5     4     3
                   2     3     2
                               2

$\mu'_1(f_8)$:   1     28    56    35    210   280   105
             8     6     5     4     4     3     2
                   2     3     4     2     3     2
                                     2     2     2
                                                 2


The proper formula can be stated immediately from its synthetic representa¬ 
tion. Thus for example 

$$\mu'_1(f_6) = b_6 n\mu_6 + 15b_{42}\,n(n-1)\mu_4\mu_2 + 10b_{33}\,n(n-1)\mu_3^2 + 15b_{222}\,n(n-1)(n-2)\mu_2^3.$$
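The numerical coefficients in the table above follow directly from the rule $r!/\{(p_1!)^{\pi_1}\cdots\pi_1!\cdots\}$ applied to the partitions of r that contain no unit parts. A short sketch, with the hypothetical helper names `partitions_no_units` and `coefficient`:

```python
from collections import Counter
from math import factorial


def partitions_no_units(r, largest=None):
    """Yield the partitions of r with every part >= 2, as non-increasing tuples."""
    if largest is None:
        largest = r
    if r == 0:
        yield ()
        return
    for first in range(min(r, largest), 1, -1):
        for rest in partitions_no_units(r - first, first):
            yield (first,) + rest


def coefficient(p):
    """r! divided by the factorials of the parts and of their multiplicities."""
    denom = 1
    for part in p:
        denom *= factorial(part)
    for mult in Counter(p).values():
        denom *= factorial(mult)
    return factorial(sum(p)) // denom
```

For r = 6 the pairs (partition, coefficient) come out as 6: 1, 42: 15, 33: 10, 222: 15, which are the four terms of $\mu'_1(f_6)$ written above.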


22. Partition Representation of the Expected Value of a Product of f Functions.
Two column partitions may be used similarly to represent the expected
values of the products of two f's, three column partitions for the expected value
of the triple product, etc. In order to obtain all terms it is only necessary to
combine every partition of each f in every possible way. The synthetic
representation of $E(f_2, f_1)$ is

1    1    2    1
21   20   11   10
     01   10   10
               01

The sum of the entries in each row indicates the proper moment while the
number of rows indicates the number of parts as in the preceding section.
The n coefficient associated with a $\rho$ rowed partition is then $n^{(\rho)}$. The b
coefficient is indicated by the columnar entries. Thus

$$\mu'_{11}(f_2, f_1) = b_2 b_1 n\mu'_3 + [b_2 b_1 + 2b_{11}b_1]\,n(n-1)\mu'_2\mu'_1 + b_{11}b_1\,n(n-1)(n-2)(\mu'_1)^3.$$
We verify this by the algebraic method

$$\begin{aligned}
\mu'_{11}(f_2, f_1) &= E\{[a_2(2) + a_{11}(1)(1)][a_1(1)]\} \\
&= E[a_2 a_1 (2)(1) + a_{11}a_1 (1)^3] \\
&= a_2 a_1[n\mu'_3 + n(n-1)\mu'_2\mu'_1] \\
&\quad + a_{11}a_1[n\mu'_3 + 3n(n-1)\mu'_2\mu'_1 + n(n-1)(n-2)(\mu'_1)^3] \\
&= (a_2 + a_{11})a_1 n\mu'_3 + (a_2 + a_{11})a_1\,n(n-1)\mu'_2\mu'_1 \\
&\quad + 2a_{11}a_1\,n(n-1)\mu'_2\mu'_1 + a_{11}a_1\,n(n-1)(n-2)(\mu'_1)^3 \\
&= b_2 b_1 n\mu'_3 + b_2 b_1\,n(n-1)\mu'_2\mu'_1 + 2b_{11}b_1\,n(n-1)\mu'_2\mu'_1 \\
&\quad + b_{11}b_1\,n(n-1)(n-2)(\mu'_1)^3
\end{aligned}$$

as indicated.

It thus appears that the partition representation is a mnemonic device for 
indicating the solution as obtained by the algebraic method. A more formal 
justification is based upon the property that if 

$E(f_2) = b_2(2) + b_{11}(1)(1)$ and $E(f_1) = b_1(1)$

then $E(f_2, f_1)$ can be obtained by a symbolic multiplication of $b_2(2) + b_{11}(1)(1)$
by $b_1(1)$ where the b's are multiplied but the power sums are collected in all
possible ways. Thus

$$E(f_2, f_1) = b_2 b_1[(3) + (2)(1)] + b_{11}b_1[2(2)(1) + (1)^3]$$

which gives

$$E(f_2, f_1) = b_2 b_1 n\mu'_3 + b_2 b_1\,n(n-1)\mu'_2\mu'_1 + 2b_{11}b_1\,n(n-1)\mu'_2\mu'_1 + b_{11}b_1\,n^{(3)}(\mu'_1)^3$$

as before.
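The formula for the expected value of the product $f_2 f_1$ can also be checked by brute force. The sketch below assumes independent sampling from a small discrete universe and arbitrary rational values of $a_2$, $a_{11}$, $a_1$; all function names and the numerical inputs in the usage note are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product


def falling(n, k):
    """Falling factorial n^(k) = n(n-1)...(n-k+1)."""
    out = 1
    for i in range(k):
        out *= n - i
    return out


def moment(values, probs, r):
    """r-th moment of the universe about the origin."""
    return sum(p * v ** r for v, p in zip(values, probs))


def brute_force(values, probs, n, a2, a11, a1):
    """E[f2 * f1] with f2 = a2 (2) + a11 (1)(1) and f1 = a1 (1)."""
    total = Fraction(0)
    for idx in product(range(len(values)), repeat=n):
        weight = Fraction(1)
        for i in idx:
            weight *= probs[i]
        xs = [values[i] for i in idx]
        s1 = sum(xs)
        s2 = sum(x * x for x in xs)
        total += weight * (a2 * s2 + a11 * s1 * s1) * (a1 * s1)
    return total


def by_formula(values, probs, n, a2, a11, a1):
    """The partition formula, with b2 = a2 + a11, b11 = a11, b1 = a1."""
    b2, b11, b1 = a2 + a11, a11, a1
    m1, m2, m3 = (moment(values, probs, r) for r in (1, 2, 3))
    return (b2 * b1 * n * m3
            + (b2 * b1 + 2 * b11 * b1) * falling(n, 2) * m2 * m1
            + b11 * b1 * falling(n, 3) * m1 ** 3)
```

With, for example, the values 0, 1, 3 taken with probabilities 1/2, 1/3, 1/6 and n = 4, the two functions return the same rational number for any choice of the a's.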

This symbolic multiplication is generally true and serves as the real algebraic 
justification of the partition representation. It will be established in a later 
paper dealing with the more general case of a finite population. The general 
type of partition analysis has been used previously by Fisher (3) and Georgescu 
(4). Each has established it through analytic rather than algebraic means. 

23. Determination of the Coefficients. Methods of determining the numerical
coefficient have previously been given by such authors as Fisher (3), Wishart (5)
(7) and Georgescu (4). If the f's are of different weight, the coefficient of any
partition (an interchange of rows is not looked upon as changing the partition)
is given by writing in the numerator the factorials of the different r's and in
the denominator the factorials of all the different entries and the factorials of
the numbers of repeated rows. Thus the coefficient of

210
111
111

is

$$\frac{4!\,3!\,2!}{2!\,(1!)^7\,2!} = 72.$$
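The rule for f's of different weight translates directly into code. The following sketch takes a two way partition as a list of rows (tuples of entries) and covers only the case in which all column weights differ; `two_way_coefficient` is a name introduced here.

```python
from collections import Counter
from math import factorial


def two_way_coefficient(rows):
    """Coefficient of a two way partition whose columns have different weights."""
    ncols = len(rows[0])
    weights = [sum(row[j] for row in rows) for j in range(ncols)]
    numerator = 1
    for w in weights:
        numerator *= factorial(w)          # factorials of the r's
    denominator = 1
    for row in rows:
        for entry in row:
            denominator *= factorial(entry)  # factorials of all entries
    for mult in Counter(tuple(row) for row in rows).values():
        denominator *= factorial(mult)     # factorials for repeated rows
    return numerator // denominator
```

For the partition with rows 210, 111, 111 this gives 72, and for the partition 21, 11 of the formula 32 it gives 6. When two or more columns have the same weight the result must still be multiplied by the number of distinct partitions obtained by interchanging those columns, which this sketch does not attempt.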
In case two or more functions have the same weight additional equivalent
partitions are formed by interchange of columns. The reader is referred to the 
above papers for rules for determining the coefficients in the more involved 
cases though the coefficients are presented for all the two way partitions of the 
next section. 

An alternative method of finding the coefficients is that given by C. C.
Craig (2, 24-25) since it appears that the symbolic formulae used in the present
paper are essentially his formulae for ν's in terms of λ's. For example his
formula for $\nu_{44}$ (2, 22) is given symbolically by the formula for 44 in the next
section. The only difference revealed is that the subscripts of the λ's are read
by rows rather than by columns and that they are sometimes interchanged.
The more precise formulation is needed for the present interpretation although
it is not needed for Prof. Craig’s purpose.

A third method utilizes the symbolic multiplication process stated in section 22.
Subscripts of the b's are used to indicate which power sums are collected.
Thus $[b_2(2) + b_{11}(1)(1)]^2$ gives

$$\begin{aligned}
& b_2 b_2 (4) + b_{20}b_{02}(2)(2) + 2[2b_{20}b_{11}(3)(1) + b_{200}b_{011}(2)(1)(1)] + 2b_{11}b_{11}(2)(2) \\
&\quad + 4b_{110}b_{101}(2)(1)(1) + b_{1100}b_{0011}(1)(1)(1)(1)
\end{aligned}$$

where the underscored terms indicate the products given by $[b_2(2)]^2$, $2[b_2(2)][b_{11}(1)(1)]$,
and $[b_{11}(1)(1)]^2$ respectively. This is represented by

1    1    4    2    2    4    1
22   20   21   20   11   11   10
     02   01   01   11   10   10
               01        01   01
                              01

The underscored terms are the only ones remaining when $\mu'_1 = 0$.

This method is especially useful when a large number of formulae are to be 
computed, as in the next section. 

24. The Partition Representation of Formulae of Total Weight ≤ 8. The
partition representation of $\mu'_1(f_r)$ when r ≤ 8 is given in section 21. The
partition representation of the remaining formulae of total weight ≤ 8, which
do not contain unit parts, is given below.

22:   1    1    2
      22   20   11
           02   11

32:   1    1    3    6
      32   30   12   21
           02   20   11
42:   1    1    8    6    4    6    3    12
      42   40   31   22   30   21   20   20
           02   11   20   12   21   20   11
                                    02   11

33:   1    6    9    1    9    9    6
      33   31   22   30   21   20   11
           02   11   03   12   11   11
                               02   11

222:  1     3     12    6     4     1     6     8
      222   220   211   201   111   200   200   110
            002   011   021   111   020   011   011
                                    002   011   101

52:   1    1    10   10   5    10   20   10   20   15   60
      52   50   41   32   40   22   31   30   30   12   21
           02   11   20   12   30   21   20   11   20   20
                                         02   11   20   11

43:   1    3    12   6    1    4    12   18   12   3    18   36   36
      43   41   32   23   40   13   31   22   30   03   21   12   21
           02   11   20   03   30   12   21   11   20   20   20   11
                                              02   20   02   11   11

322:  1     2     4     12    3     1     4     6     12    12
      322   320   311   221   122   022   301   220   121   211
            002   011   101   200   300   021   102   201   111

      1     2     6     12    12    12    24    12    24
      300   300   102   021   201   111   210   120   111
      020   011   020   101   020   011   101   101   101
      002   011   200   200   101   200   011   101   110

62:   1    1    12   15   6    30   20   15   20
      62   60   51   42   50   41   32   40   31
           02   11   20   12   21   30   22   31

      15   30   120  45   10   60   120  90   15   90
      40   40   31   22   30   30   30   21   20   20
      20   11   20   20   30   12   21   21   20   20
      02   11   11   20   02   20   11   20   20   11
                                              02   11

53:   1    3    15   10   1    15   30   10   5    30
      53   51   42   33   50   41   32   23   40   31
           02   11   20   03   12   21   30   13   22

      15   60   90   15   30   10   30   60   90   90   45   60
      40   31   22   13   31   30   30   30   12   21   20   20
      11   11   20   20   20   03   21   12   21   21   20   11
      02   11   11   20   02   20   02   11   20   11   11   11
                                                        02   11

44:   1    12   16   8    48   1    16   18
      44   42   33   41   32   40   31   22
           02   11   03   12   04   13   22

      6    96   36   72   48   16   72   144  9    72   24
      40   31   22   22   30   30   21   21   20   20   11
      02   11   20   11   12   03   21   12   20   11   11
      02   02   02   11   02   11   02   11   02   11   11
                                              02   02   11

422:  1     2     4     16    6     4     8     4     24    16
      422   420   411   321   222   401   320   122   212   311
            002   011   101   200   021   102   300   210   111

      1     16    6     12
      400   310   220   211
      022   112   202   211

      1     2     16    32    12    3     24    24    48    48
      400   400   310   310   202   022   211   220   211   121
      020   011   110   101   200   200   200   101   101   200
      002   011   002   011   020   200   011   101   110   101

      8     16    12    24    12    16    48    96    24    24
      300   300   210   021   120   300   201   210   111   210
      120   021   210   201   102   111   120   111   111   201
      002   101   002   200   200   011   101   101   200   011

      3     24    6     48    24
      200   200   200   200   110
      200   110   200   110   110
      020   110   011   101   101
      002   002   011   011   101

332:  1     1     9     12    6     2     18    18    6     12
      332   330   222   321   312   302   212   221   320   311
            002   110   011   020   030   120   111   012   021

      2     9     18    6
      301   220   211   310
      031   112   121   022

      9     18    6     12    12    18    9     72    18    36
      220   220   310   301   310   202   112   211   112   211
      110   101   020   020   011   110   200   110   110   101
      002   011   002   011   011   020   020   011   110   020

      1     6     12    9     18    36    36    18    36    72    36
      300   300   300   210   210   210   201   201   210   210   111
      030   012   021   120   102   012   111   021   101   111   111
      002   020   011   002   020   110   020   110   021   011   110

      9     18    36    6     36
      200   200   200   110   110
      110   101   110   110   110
      020   011   011   110   101
      002   020   011   002   011

2222: 1      4      24     24     32     3      24     8
      2222   2220   2211   2201   2111   2200   2011   1111
             0002   0011   0021   0111   0022   0211   1111

      6      12     48     96     48
      2200   2200   2011   2011   1111
      0020   0011   0011   0101   1100
      0002   0011   0200   0110   0011

      24     48     96     16     48     16     32
      2001   2010   2100   0111   1011   1011   0111
      0201   0201   0111   0111   1110   0111   1101
      0020   0011   0011   2000   0101   1100   1010

      1      12     32     12     48
      2000   2000   2000   1100   1100
      0200   0200   0101   1100   0110
      0020   0011   0110   0011   0011
      0002   0011   0011   0011   1001

25. The Formulae for the Sample Moments about a Fixed Point in Terms
of the Moments of the Universe. The partitions of section 21 and section 24
can be immediately interpreted to give the formulae for the moments of the
sample function. For example

$$\mu'_{11}(f_3, f_2) = b_3 b_2 n\mu_5 + (b_3 b_2 + 3b_{21}b_2 + 6b_{21}b_{11})\,n(n-1)\mu_3\mu_2$$

and the value of $\mu'_{21}(f_3, f_2)$ as given in section 15 can be read by inspection.
The values of the b's are to be inserted for any specific function. The
coefficient of $\mu_2^3$ in the expansion of $\mu'_3(f_2)$ is

$$(b_2^3 + 6b_2 b_{11}^2 + 8b_{11}^3)\,n(n-1)(n-2).$$

In case $f_2 = m_2$, $b_2 = \dfrac{n-1}{n^2}$ and $b_{11} = -\dfrac{1}{n^2}$ so that the coefficient is

$$\frac{(n-1)(n-2)(n^3 - 3n^2 + 9n - 15)}{n^5}$$

as indicated previously by Tchouproff (10, 192) and Church (9, 82).
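The substitution $b_2 = (n-1)/n^2$, $b_{11} = -1/n^2$ is easy to confirm with exact fractions. A minimal sketch (the function names are illustrative):

```python
from fractions import Fraction


def falling(n, k):
    """Falling factorial n^(k) = n(n-1)...(n-k+1)."""
    out = 1
    for i in range(k):
        out *= n - i
    return out


def coefficient(n):
    """(b2^3 + 6 b2 b11^2 + 8 b11^3) n(n-1)(n-2) with the b's of m_2."""
    b2 = Fraction(n - 1, n ** 2)
    b11 = Fraction(-1, n ** 2)
    return (b2 ** 3 + 6 * b2 * b11 ** 2 + 8 * b11 ** 3) * falling(n, 3)


def closed_form(n):
    return Fraction((n - 1) * (n - 2) * (n ** 3 - 3 * n ** 2 + 9 * n - 15), n ** 5)
```

The two functions agree for every sample size tried, which supports the reading of the coefficient given above.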

The partitions of section 21 give the 8 formulae $\mu_{r,(N)}$ which Tchouproff
gave (10, 155). In this case $f_r = m'_r$ and every b is 0 except those having single
subscripts and these equal $\dfrac{1}{n}$.

The partitions of section 21 give the formulae $\nu_{r,(N)}$ which were given by
Tchouproff (10, 186). In this case it is only necessary to take $f_r = m_r$ and to
give the b's the proper values. Tchouproff has arranged his results according
to decreasing powers of n. As an illustration we derive his result for
$\nu_{4,(N)} = \mu'_1(m_4)$. From section 21

$$\mu'_1(f_4) = b_4 n\mu_4 + 3b_{22}\,n(n-1)\mu_2^2$$

and from Table II

$$b_4 = \frac{(n-1)(n^2 - 3n + 3)}{n^4} \quad\text{and}\quad b_{22} = \frac{2n-3}{n^4}$$

so that

$$\nu_{4,(N)} = \mu_4 + \frac{1}{n}(6\mu_2^2 - 4\mu_4) - \frac{1}{n^2}(15\mu_2^2 - 6\mu_4) + \frac{1}{n^3}(9\mu_2^2 - 3\mu_4)$$

as indicated by him.

The partitions of section 24 also give formulae which have appeared before.
For example the partitions

1    1    2
22   20   11
     02   11

which symbolise the formula

$$\mu'_2(f_2) = b_2^2\,n\mu_4 + (b_2^2 + 2b_{11}^2)\,n(n-1)\mu_2^2$$

become

$$\frac{n-1}{n^3}\,[(n-1)\mu_4 + (n^2 - 2n + 3)\mu_2^2]$$

which was early derived by “Student” (8, 3) and Tchouproff (10, 192). Similarly
the partitions of 222 and 2222 give the formulae for $\mu'_3(m_2)$ and $\mu'_4(m_2)$
which were given by Tchouproff (10, 192-193) and Church (9, 82).

Sections 21 and 24 can then be used to write the moments about a fixed 
point of a sample function in terms of the moments of the universe. In the 
case of new functions the b's must first be determined. Formulae involving
unit columnar partitions are not included. If the formulae were desired in 
terms of moments about a fixed point of the universe, it would be necessary 
to write in addition all possible partitions. See for example the last formula 
of section 23. 

26. The Formulae for Moments of Any Sample Function in Terms of Moments
of the Universe. The partitions of sections 21 and 24 are also useful in
writing the formulae for the moments of the sample moments. It is necessary
to make the usual adjustments in changing from moments about a fixed point
to moments:

$$\mu_2(f_r) = \mu'_2(f_r) - [\mu'_1(f_r)]^2$$

$$\mu_{11}(f_{r_1}, f_{r_2}) = \mu'_{11}(f_{r_1}, f_{r_2}) - \mu'_{10}(f_{r_1}, f_{r_2})\,\mu'_{01}(f_{r_1}, f_{r_2}).$$

The particular two way partitions which are involved in this adjustment are
immediately recognizable. They are the ones which have an entry which is
the only entry in the row and in the column in which it is. Thus

3
220
002

gives one of the terms contributing to $\mu'_2(f_2)\,\mu'_1(f_2)$. In addition its coefficient is the
same, if sign is not considered, as the coefficient of $\mu'_2(f_2)\,\mu'_1(f_2)$ in the expansion
of $\mu_3(f_2)$ in terms of moments of $f_2$. This has to be true since each is the number
of ways of forming

220
002

And so in general the remaining function of n accompanying this adjustment is
the product of the coefficient associated with 22 and that associated with 2.
The sign is plus when odd numbers of moments are multiplied and minus when
even numbers of moments are multiplied. Hence the partition 220, 002 with
coefficient 3 contributes $-3n^2 b_2^3$ to the adjustment to moments and its total
contribution to the value of $\mu_3(f_2)$ is $3b_2^3[n(n-1) - n^2] = -3b_2^3 n$. More
extensive study leads to the following general method of using the formulae of
section 24.

A. Write the coefficient of every two way partition according to section 25.

B. Block off each single entry by drawing a line through its row and column.
For example

6
2200
0020
0002

The resulting partitions, 22, 2, 2 are called component parts.

C. Form new partitions by eliminating component parts one at a time, two
at a time, three at a time, etc. from the original partition in all possible ways.

D. Form the coefficient of the resulting parts according to the methods of
section 25. Multiply by $(-1)^{s-1}$ where s is the number of resulting parts.
The values of b will not change.

E. Multiply in addition by $s - 1$ when the component parts are all taken
separately.

As an example we find the contribution of the partition

6
2200
0020
0002

to the value of $\mu_4(f_2)$. It gives

$$6b_2^4[n(n-1)(n-2) - 3n^2(n-1) + 2n^3]\,\mu_4\mu_2\mu_2 = 12nb_2^4\,\mu_4\mu_2^2.$$

Similarly

1
2000
0200
0020
0002

contributes

$$b_2^4[n^{(4)} - 4n\,n^{(3)} + 6n^3(n-1) - 3n^4]\,\mu_2^4 = 3b_2^4(n-2)n\,\mu_2^4.$$
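The two bracketed polynomials in n that appear in these examples can be simplified mechanically. The sketch below checks only that arithmetic, namely that the first bracket reduces to 2n and the second to 3n(n - 2); the function names are introduced here for illustration.

```python
def falling(n, k):
    """Falling factorial n^(k) = n(n-1)...(n-k+1)."""
    out = 1
    for i in range(k):
        out *= n - i
    return out


def three_part_bracket(n):
    """n(n-1)(n-2) - 3 n^2 (n-1) + 2 n^3."""
    return falling(n, 3) - 3 * n ** 2 * (n - 1) + 2 * n ** 3


def four_part_bracket(n):
    """n^(4) - 4 n n^(3) + 6 n^3 (n-1) - 3 n^4."""
    return falling(n, 4) - 4 * n * falling(n, 3) + 6 * n ** 3 * (n - 1) - 3 * n ** 4
```

Evaluating both for a range of n confirms the right-hand sides $12nb_2^4$ and $3b_2^4(n-2)n$ once the factors 6 and 1 are restored.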

We use the method in finding the coefficient of $\mu_2^3$ in the expansion of $\mu_3(m_2)$.
We find first the coefficient of $\mu_2^3$ in the expansion of $\mu_3(f_2)$. It is indicated by
the partitions

1     6     8
200   200   110
020   011   011
002   011   101

so that the coefficient of $\mu_2^3$ is

$$\begin{aligned}
& b_2^3[n(n-1)(n-2) - 3n^2(n-1) + 2n^3] + 6b_2 b_{11}^2[n(n-1)(n-2) - n^2(n-1)] \\
&\quad + 8b_{11}^3\,n(n-1)(n-2) = b_2^3(2n) + 6b_2 b_{11}^2(-2n^2 + 2n) + 8b_{11}^3\,n(n-1)(n-2).
\end{aligned}$$

When $b_2 = \dfrac{n-1}{n^2}$ and $b_{11} = -\dfrac{1}{n^2}$ this becomes $\dfrac{2(n-1)(n^2 - 12n + 15)}{n^5}$, as
previously given by such authors as Tchouproff (10, 194), Church (9, 82),
Carver (Richardson) (11, 271).
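This reduction can be verified with exact rational arithmetic. A brief sketch, assuming the b's of $m_2$ stated above (function names are illustrative):

```python
from fractions import Fraction


def falling(n, k):
    """Falling factorial n^(k) = n(n-1)...(n-k+1)."""
    out = 1
    for i in range(k):
        out *= n - i
    return out


def third_moment_coefficient(n):
    """b2^3 (2n) + 6 b2 b11^2 (-2n^2 + 2n) + 8 b11^3 n(n-1)(n-2) for m_2."""
    b2 = Fraction(n - 1, n ** 2)
    b11 = Fraction(-1, n ** 2)
    return (b2 ** 3 * (2 * n)
            + 6 * b2 * b11 ** 2 * (-2 * n ** 2 + 2 * n)
            + 8 * b11 ** 3 * falling(n, 3))


def closed_form(n):
    return Fraction(2 * (n - 1) * (n ** 2 - 12 * n + 15), n ** 5)
```

For n = 2, for instance, both expressions give -5/16.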

The general Tchouproff-Church formulae for the third and fourth moments 
of the variance may be written out in this way as may many other moment 
formulae which have not been printed. 

27. The Thiele Moments of the Sample Function in Terms of the Moments
of the Universe. It is possible also to write the Thiele moments of the sample
function in terms of the moments of the universe. The technique is very
similar to that of the previous section. The basis of the transformation is
now the formula for Thiele moments in terms of moments about a fixed point
rather than moments in terms of moments about a fixed point. The results
are the same as those of the last section when a double or a triple product of
f's is involved, but they differ with the introduction of a larger number of
products. The partitions having component parts are broken up into these
component parts as before but the parts are combined in all possible ways.
Multipliers are determined as before with the exception that there is a
multiplication by $(-1)^{s-1}(s-1)!$ where s is the number of resultant parts. Thus the
term

2000
0200
0020
0002

contributes

$$b_2^4[n^{(4)} - 4n\,n^{(3)} - 3n^2(n-1)^2 + 12n^3(n-1) - 6n^4]\,\mu_2^4 = -6b_2^4 n\,\mu_2^4$$

to the value of $\lambda_4(f_2)$.
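The bracket in this contribution again reduces by plain arithmetic. A minimal check, with the function names chosen here for illustration:

```python
def falling(n, k):
    """Falling factorial n^(k) = n(n-1)...(n-k+1)."""
    out = 1
    for i in range(k):
        out *= n - i
    return out


def thiele_bracket(n):
    """n^(4) - 4 n n^(3) - 3 n^2 (n-1)^2 + 12 n^3 (n-1) - 6 n^4."""
    return (falling(n, 4) - 4 * n * falling(n, 3) - 3 * n ** 2 * (n - 1) ** 2
            + 12 * n ** 3 * (n - 1) - 6 * n ** 4)
```

The five multipliers 1, -4, -3, 12, -6 are those of the fourth Thiele moment in terms of moments about a fixed point, and the bracket equals -6n for every n.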

28. The Moments About a Fixed Point of the Sample Function in Terms
of the Thiele Moments of the Universe. We return to the problem of section
25, only we wish to express the results in terms of the Thiele moments of
the universe. We must use the formulae of section 12,

$$\mu_r = \sum \binom{r}{p_1^{\pi_1} \cdots p_s^{\pi_s}}\,\lambda_{p_1}^{\pi_1} \cdots \lambda_{p_s}^{\pi_s},$$

where $p_i \neq 1$.

Thus $\mu_r$ will contribute to all partitions of r and inversely the contributions
to a given partition are composed only of those terms which are obtained by
combining the different elements of the partition. Since the numerical
coefficient in the expansion of $\mu_r$ is the number of ways in which the r units can
be collected to form the partition, it follows at once that the complete λ
coefficient can be obtained by grouping the parts of the partition in all possible
ways, determining the coefficient of each according to the methods of section 25,
and adding. In this way the formulae of section 21 can be used to give
expansions in terms of partition moments. For example the representation of $\mu'_1(f_6)$
gives at once

$$\begin{aligned}
& b_6 n\lambda_6 + 15[b_6 n + b_{42}\,n(n-1)]\lambda_4\lambda_2 + 10[b_6 n + b_{33}\,n(n-1)]\lambda_3^2 \\
&\quad + 15[b_6 n + 3b_{42}\,n(n-1) + b_{222}\,n(n-1)(n-2)]\lambda_2^3.
\end{aligned}$$

The partitions of section 21 can be made to give the formulae $\mu'_1(l_r)$ which
were given by Thiele (1, 45-46). For example the formula for $\mu'_1(l_4)$ is
indicated by

1    3
4    2
     2

so that

$$\mu'_1(f_4) = b_4 n\lambda_4 + 3[b_4 n + b_{22}\,n(n-1)]\lambda_2^2$$

and since

$$b_4 = \frac{(n-1)(n^2 - 6n + 6)}{n^4} \quad\text{and}\quad b_{22} = -\frac{n^2 - 4n + 6}{n^4},$$

$$\mu'_1(l_4) = \frac{(n-1)(n^2 - 6n + 6)\lambda_4}{n^3} - \frac{6(n-1)\lambda_2^2}{n^2}$$

which agrees with the result as given by him (1, 45).

The two way partitions of section 24 can be used similarly. This device
for changing to the λ's is due to the ingenuity of R. A. Fisher who applied it to
the case where $f_r = k_r$.

As an illustration we write from section 24 the value of $\mu'_2(f_2)$ in terms of λ's.
The partition representation

1    1    2
22   20   11
     02   11

gives at once

$$b_2^2\,n\lambda_4 + [b_2^2\,n + b_2^2\,n(n-1)]\lambda_2^2 + 2[b_2^2\,n + b_{11}^2\,n(n-1)]\lambda_2^2$$

which agrees with the result of section 12. The other illustrations of that
section may be written out similarly.

As a final illustration of this technique we find the coefficient of $\lambda_4^2$ in the
expansion of $\mu'_{21}(f_3, f_2)$. The partitions are

2     9     18    6
301   220   211   310
031   112   121   022

and the coefficient is

$$\begin{aligned}
& 2[b_3^2 b_2\,n + b_3^2 b_{11}\,n(n-1)] + 9[b_3^2 b_2\,n + b_{21}^2 b_2\,n(n-1)] \\
&\quad + 18[b_3^2 b_2\,n + b_{21}^2 b_{11}\,n(n-1)] + 6[b_3^2 b_2\,n + b_3 b_{21} b_2\,n(n-1)].
\end{aligned}$$

If the b's are inserted to form the k's, the first and last terms become 0 and the
others give $\dfrac{27n - 45}{n(n-1)^2}$. This agrees with the value as given by R. A. Fisher
(3, 208).
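A quick check of this substitution, using the values of the b's for the k functions from section 17 ($b_3 = b_2 = 1/n$, $b_{21} = b_{11} = -1/[n(n-1)]$). The function names below are illustrative only.

```python
from fractions import Fraction


def terms(n):
    """The four bracketed terms, with their multipliers 2, 9, 18, 6."""
    b3 = b2 = Fraction(1, n)
    b21 = b11 = Fraction(-1, n * (n - 1))
    pairs = n * (n - 1)
    return [
        2 * (b3 ** 2 * b2 * n + b3 ** 2 * b11 * pairs),
        9 * (b3 ** 2 * b2 * n + b21 ** 2 * b2 * pairs),
        18 * (b3 ** 2 * b2 * n + b21 ** 2 * b11 * pairs),
        6 * (b3 ** 2 * b2 * n + b3 * b21 * b2 * pairs),
    ]


def coefficient(n):
    return sum(terms(n))
```

The first and last entries of `terms(n)` vanish for every n ≥ 2, and the sum equals $(27n - 45)/[n(n-1)^2]$.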


29. The Moments of the Sample Function in Terms of the Thiele Moments
of the Universe. The partition representations of section 21 and section 24
can be used similarly to write formulae for the moments of the sample function
in terms of the Thiele moments of the universe. It is only necessary to use the
general plan of section 26, but to write the coefficient of every resulting
partition according to the method of section 28. For example the partition

2000
0200
0020
0002

gives the coefficient

$$\begin{aligned}
& b_2^4[n + 4n^{(2)} + 3n^{(2)} + 6n^{(3)} + n^{(4)}] - 4b_2^4[n^2 + 3n^2(n-1) + n^2(n-1)(n-2)] \\
&\quad + 6b_2^4[n^3 + n^3(n-1)] - 3b_2^4 n^4 = b_2^4[n^4 - 4n^4 + 6n^4 - 3n^4] = 0.
\end{aligned}$$
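Each bracket in this coefficient collapses to $n^4$, which is why the total vanishes. The following sketch checks that arithmetic; the function names are assumptions of this example.

```python
def falling(n, k):
    """Falling factorial n^(k) = n(n-1)...(n-k+1)."""
    out = 1
    for i in range(k):
        out *= n - i
    return out


def brackets(n):
    """The four brackets multiplying b_2^4, before the factors 1, -4, 6, -3."""
    first = n + 4 * falling(n, 2) + 3 * falling(n, 2) + 6 * falling(n, 3) + falling(n, 4)
    second = n ** 2 + 3 * n ** 2 * (n - 1) + n ** 2 * (n - 1) * (n - 2)
    third = n ** 3 + n ** 3 * (n - 1)
    fourth = n ** 4
    return first, second, third, fourth


def total(n):
    first, second, third, fourth = brackets(n)
    return first - 4 * second + 6 * third - 3 * fourth
```

The first bracket is the familiar expansion of $n^4$ in factorial powers, with multipliers 1, 7, 6, 1.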


30. The Thiele Moments of the Sample Function in Terms of the Thiele
Moments of the Universe. The partition representations of section 21 and
section 24 can also be interpreted to give the Thiele moments of the sample
function in terms of the Thiele moments of the universe. The scheme is
similar to that of section 29 except that the formulae for changing to Thiele
moments are used as in section 27. For example the partition

2000
0200
0020
0002

has now associated with it

$$\begin{aligned}
& b_2^4[n + 4n^{(2)} + 3n^{(2)} + 6n^{(3)} + n^{(4)}] - 4b_2^4[n^2 + 3n^2(n-1) + n^2(n-1)(n-2)] \\
&\quad - 3b_2^4[n + n(n-1)]^2 + 12b_2^4[n^3 + n^3(n-1)] - 6b_2^4 n^4 = 0.
\end{aligned}$$

The application of this method enables one to write the formulae of section 13
(and others which they typify) with relative ease. It is now possible to
complete the task left unfinished in section 15. We do not take the space necessary
to write all the terms of $\lambda_{21}(f_3, f_2)$ since the lengthy expression can be obtained
quite readily from the representation of section 24. One term, say the
coefficient of $\lambda_6\lambda_2$, is represented by

1     9     12    6
330   222   321   312
002   110   011   020

and gives

$$\begin{aligned}
& 9[b_3^2 b_2\,n + b_{21}^2 b_2\,n(n-1)] + 12[b_3^2 b_2\,n + b_3 b_{21} b_{11}\,n(n-1)] \\
&\quad + 6[b_3^2 b_2\,n + b_3 b_{21} b_2\,n(n-1)]
\end{aligned}$$

which becomes $\dfrac{21}{n(n-1)}$ when $b_3 = b_2 = \dfrac{1}{n}$ and $b_{21} = b_{11} = -\dfrac{1}{n(n-1)}$. This
agrees with the result given by R. A. Fisher (3, 209).
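The value $21/[n(n-1)]$ follows from the stated b's by direct substitution, as the following sketch confirms (the function name is illustrative):

```python
from fractions import Fraction


def coefficient(n):
    """9[...] + 12[...] + 6[...] with the b's of the k functions."""
    b3 = b2 = Fraction(1, n)
    b21 = b11 = Fraction(-1, n * (n - 1))
    pairs = n * (n - 1)
    return (9 * (b3 ** 2 * b2 * n + b21 ** 2 * b2 * pairs)
            + 12 * (b3 ** 2 * b2 * n + b3 * b21 * b11 * pairs)
            + 6 * (b3 ** 2 * b2 * n + b3 * b21 * b2 * pairs))
```

The three brackets contribute $9/[n(n-1)]$, $12/[n(n-1)]$ and 0 respectively.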

For simplicity of form it is logical to use this formulation of results, Thiele
moments in terms of Thiele moments, and it has been used by Thiele (1),
Craig (2), Fisher (3) and Georgescu (4). They however have used different
sample moment functions. Thiele and Georgescu used the Thiele moments
of the sample, Craig and Georgescu the moments while Fisher introduced the
k function.

The present discussion deals with the corresponding partition moments of
any rational integral isobaric moment function of the sample. The results
indicated here give many of the results of the previous authors as special cases.
For example the symbolic formula 44 of section 24 gives the $\lambda_2(l_4)$ of Thiele
(1, 45), the $S_{11}(\nu_4, \nu_4)$ of Craig (2, 57), the $\kappa(44)$ of R. A. Fisher (3, 210) as
special cases when the formula 44 is given the interpretation of this section.

Some may prefer the Craig attack (2, 21-35) to the partition method. It 
should be noted that the formulae of sections 21 and 24 can be used in place 
of part of the Craig method. Thus his formulae (2, 22) 

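$$\nu_{80} = \lambda_{80} + 28\,\lambda_{60}\lambda_{20} + 56\,\lambda_{50}\lambda_{30} + \text{etc.}$$

$$\nu_{44} = \lambda_{44} + (12\,\lambda_{42}\lambda_{02} + 16\,\lambda_{33}\lambda_{11}) + \text{etc.}$$

are immediately obtainable from the symbolic formulae by writing $\lambda$'s in place
of b's and by using row, rather than column, subscripts. It is then necessary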
to compute the values of A*,*,... as given by him (2, 16-17, 40) and to insert 
in his expansions of Ski(» m , v n ) in terms of v’a. For example 

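$$S_{11}(\nu_3, \nu_2) = \frac{1}{n}\left[\nu_{50} + (n-1)\,\nu_{32} - n\,\nu_{30}\nu_{20}\right]$$ (2, 32)

and from the symbolic formulae of sections 21 and 24

$$\nu_{50} = \lambda_{50} + 10\,\lambda_{30}\lambda_{20}$$
$$\nu_{32} = \lambda_{32} + \lambda_{30}\lambda_{02} + 3\,\lambda_{12}\lambda_{20} + 6\,\lambda_{21}\lambda_{11}$$
$$\nu_{30} = \lambda_{30}$$
$$\nu_{20} = \lambda_{20}$$

$$S_{11}(\nu_3, \nu_2) = \frac{1}{n}\left[\lambda_{50} + (n-1)\,\lambda_{32} + 9\,\lambda_{30}\lambda_{20} + (n-1)(6\,\lambda_{21}\lambda_{11} + 3\,\lambda_{12}\lambda_{20})\right]$$ (2, 30)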

which agrees with that given by Prof. Craig (aside from an obvious typographical 
error). The insertion of the values of X gives the value as indicated by 
Xu(w*, m2) of section 13 and by the first method of the present section. 

31. Special Rules for the Determination of the Coefficients in the Case of 
the Fisher and Georgescu Analyses. R. A. Fisher (3) gave a number of simple 
rules which assist greatly in the determination of the coefficients accompanying 
the partitions. Georgescu (4) also introduced special rules for the evaluation 
of the coefficients of the different partitions he used. It is not to be expected
that all these rules are applicable in the more general case under present
consideration, but the vanishing of such coefficients as that of

    2000
    0200
    0020
    0002

leads one to suspect that there might be some rules which are applicable to this
general case. A sensible method of procedure is to examine the rules of Fisher and
Georgescu and determine if they hold in the more general analysis. The special
rules of R. A. Fisher might be given somewhat as follows.

A. If a partition has a column with a single entry, that column may be
eliminated and the factor $n^{-1}$ introduced.

B. Any partition having a row with a single entry may be neglected. 

C. “We may exclude any partition in which any set of rows is connected 
to its complementary set by a single column only.” 

D. In determining the algebraic coefficient of a partition the “pattern” is
sufficient and precise entries are not needed. Thus the partitions

    21        35
    11  and   42

although they have different numerical factors, have associated with them the
same function of n. This value is indicated by the pattern

    xx
    xx

which has associated with it the function $\frac{1}{n-1}$. As a result of this
property Fisher was able to provide a table (3, 223-226) of useful patterns which
is of great assistance in writing the value of the coefficients.

E. Formulae of moments of k functions involving $k_1$ can be derived from
corresponding formulae not involving $k_1$. “The effect upon the corresponding
formula of adding a new unit part to the partition is (1) to modify every 
term in the formula by increasing the suffix of one of its k functions by unity 
in every possible way, and (2) to divide the whole by n.” (3, 206). 

Two of the important Georgescu rules may be stated. 

A'. The numerator function (aside from numerical coefficient) is not altered 
if columns are changed to rows and vice versa. Thus the coefficient of 4 in
*S(3 s ) =* ~T%r^r an d the coefficient of sj in S(2 3 ) is Georgescu 

has replaced n by N + 1. 

B'. All partitions which can be broken up into component parts have coefficients
of 0. This is extended to include all partitions which have as component
parts other partitions. Thus

    2100
    1100
    0012
    0034

has a coefficient 0 as does the equivalent

    2010
    1010
    0102
    0304

32. Special Rules for the Determination of the Coefficients in the More 
General Case. In the more general case we have 

A. If a partition has a single column with a single entry, c, that column
may be eliminated and the value $b_c$ inserted as a multiplier. This is immediately
evident since the contribution of that column to each term in the
expansion is $b_c$ times its value if the column were eliminated.

B. The coefficient of any partition having an entry which is the only entry 
in its row and column, is 0. 

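This rule, which saves considerable labor in that it makes unnecessary the
computation of the coefficients of many of the partitions of section 24, is
established in this way. Without loss of generality the partition may be
represented by

$$\begin{matrix}
c_{11} & c_{12} & c_{13} & \cdots & c_{1v} & 0 \\
c_{21} & c_{22} & c_{23} & \cdots & c_{2v} & 0 \\
c_{31} & c_{32} & c_{33} & \cdots & c_{3v} & 0 \\
\cdots & \cdots & \cdots & \cdots & \cdots & \cdots \\
c_{u1} & c_{u2} & c_{u3} & \cdots & c_{uv} & 0 \\
0 & 0 & 0 & \cdots & 0 & c_{u+1,v+1}
\end{matrix}$$
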

and $\pi_{u,v}$ may represent the partition containing the first u rows and the first v
columns. We determine the coefficient of $\pi_{u+1,v+1}$ in terms of the coefficient
of $\pi_{u,v}$. Consider first any grouping of the u rows of $\pi_{u,v}$ into w rows. There
will be w corresponding groupings of $\pi_{u+1,v+1}$ in which the last row is added, in
turn, to each of the w rows and another w + 1 rowed term in which it is not
added. In each of the first w cases the coefficient by rule A is multiplied by
$b_{c_{u+1,v+1}}$. In the case of the w + 1 rowed partition the coefficient is multiplied
by $b_{c_{u+1,v+1}}$, and $n^{(w)}$ is replaced by $n^{(w+1)}$. A final adjustment takes
care of the transition from the moment about a fixed point of the sample
function to the Thiele moment of the sample function. This adjustment demands
the multiplication of the coefficient of $\pi_{u,v}$ by $b_{c_{u+1,v+1}}\,n$ and the
subtraction from the sum of the other terms. If $B_w$ is the coefficient of the w
rowed form, it follows at once that the corresponding coefficient is

$$B_w\, b_{c_{u+1,v+1}} \left[ w\,n^{(w)} + n^{(w+1)} - n\,n^{(w)} \right] = 0.$$

This holds for the expansion of any term of $\pi_{u,v}$ and hence the coefficient of
$\pi_{u+1,v+1}$ is 0. Of course the argument holds if the partition has more than 2
component parts.

It thus appears that this rule holds not only for $k_r$ and $m_r$ as Fisher and
Georgescu have noted, but for $f_r$.

C. The coefficient of any partition which can be broken into component 
parts is 0. In this sense a component part is any group of rows or columns 
which have no entry in common with any other group of rows or columns. 
It corresponds in matrix language to a matrix which results when one matrix 
is zero bordered by another matrix although rows and columns may thereafter 
be interchanged. 

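The proof of this more general case follows the general line of the simpler
case although the reasoning is more complicated. For example the coefficient of

$$\begin{matrix}
c_{11} & c_{12} & \cdots & c_{1v} & 0 & 0 \\
c_{21} & c_{22} & \cdots & c_{2v} & 0 & 0 \\
c_{31} & c_{32} & \cdots & c_{3v} & 0 & 0 \\
\cdots & \cdots & \cdots & \cdots & \cdots & \cdots \\
c_{u1} & c_{u2} & \cdots & c_{uv} & 0 & 0 \\
0 & 0 & \cdots & 0 & c_{u+1,v+1} & c_{u+1,v+2} \\
0 & 0 & \cdots & 0 & 0 & c_{u+2,v+2}
\end{matrix}$$

is 0 since any w rowed term of the $\pi_{u,v}$ contributes

$$B_w\, b_{c_{u+1,v+1}}\, b_{c_{u+1,v+2}+c_{u+2,v+2}} \left[ w\,n^{(w)} + n^{(w+1)} - n\,n^{(w)} \right]$$

$$+\; B_w\, b_{c_{u+1,v+1}}\, b_{c_{u+1,v+2}\,c_{u+2,v+2}} \left[ w(w-1)\,n^{(w)} + 2w\,n^{(w+1)} + n^{(w+2)} - n(n-1)\,n^{(w)} \right] = 0.$$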
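
The vanishing of the two bracketed factors used in the proofs of rules B and C can be checked numerically. The sketch below is only an illustration, not part of the original argument: it assumes that $n^{(w)}$ denotes the falling factorial n(n-1)...(n-w+1), and the function names are hypothetical.

```python
def falling(n, w):
    """Falling factorial n^(w) = n (n-1) ... (n-w+1); equals 1 when w = 0."""
    result = 1
    for i in range(w):
        result *= n - i
    return result


def single_entry_bracket(n, w):
    """w n^(w) + n^(w+1) - n n^(w), the factor appearing in the proof of rule B."""
    return w * falling(n, w) + falling(n, w + 1) - n * falling(n, w)


def two_row_bracket(n, w):
    """w(w-1) n^(w) + 2w n^(w+1) + n^(w+2) - n(n-1) n^(w), the second factor in rule C."""
    return (w * (w - 1) * falling(n, w)
            + 2 * w * falling(n, w + 1)
            + falling(n, w + 2)
            - n * (n - 1) * falling(n, w))


# Check both identities over a small grid of sample sizes n and row counts w.
all_zero = all(
    single_entry_bracket(n, w) == 0 and two_row_bracket(n, w) == 0
    for n in range(1, 12)
    for w in range(0, 8)
)
print(all_zero)
```

Both brackets vanish because $n^{(w+1)} = (n-w)\,n^{(w)}$, so each expression reduces to a polynomial in n and w that cancels identically.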

Other special rules of Fisher and Georgescu do not hold in the general case.
Thus Fisher rule B is not generally true since the partitions

    12        22
    30  and   20

have respective algebraic coefficients of $b_4 b_2 n + b_{13} b_2\, n(n-1)$ and

$b_4 b_2 n + b_{22} b_2\, n(n-1)$

and these are not in general equal to 0.

The Fisher rule C is replaced by the somewhat less general C of the present
section.

The Fisher rule D is not applicable in the general case. It is, however,
applicable in all cases in which the value of the $b_{p_1 p_2 \cdots}$ is completely
determined by the number of parts, for in this case the particular value of each
part is not pertinent. We may say then that the Fisher rule D is applicable
to all cases in which $b_{p_1 p_2 \cdots}$ is a function of p, n where p is the number

of parts. This condition is satisfied by b P *i = -— — ^ and the 

coefficients are worked out for it in Fisher’s paper. The same method is 
applicable to other functions satisfying the general condition although the 
values of the coefficients will of course vary with the definition of b. 

The Fisher rule E is not applicable to the general case. Its validity, from 
an algebraic standpoint, depends upon the Fisher property B which is not 
generally applicable. The Fisher rule E as applied to the more general case 
gives correct terms but it does not give all the terms. For example the Fisher 
rule E applied to X*(&») gives 

n n — 1 

The application of a corresponding rule to 

Xs(/s) — 6*71X4 -f* 2[b\n -f- 6un(n — 1)]X» 

would give 

Xii(/2,/i) — 6*6inX 5 -J- 4[6*6i7i -4- bnbin{n — l)]XsXs 
while the correct result is indicated by 

14 2 4 

221 210 201 111 
011 020 110 


and is 


X«(/*/i) = 6*6inXs •+■ 4[6|6in *f- 6j6n6in(w —1)]XjXj + 2[6*6iw + b%bin(n 1 ) ]X*X* 
+ 4[6j6i« + 6iib : n(n — 1)]XjXj. 
The difference is due to the vanishing of the two middle terms in the case of
the k functions. 

The rule B', which Georgescu found most useful in computing and checking 
his formulae, is not generally true. It is not even true in the case of the k 
function, as can be discovered by using it on the list given by R. A. Fisher 
(3, 210). It is interesting to note that the Georgescu method, while not being 
able to utilise many of the special rules of the Fisher method, does use this rule 
which is not in general adaptable to the Fisher method. 

33. Special Rules in the Case of the h' Functions. Special rules can be
worked out for other sample functions. As an illustration we examine the 

function h! r which was defined in section 19. It is recalled that bip = and 

n (p) 

that b p * x i... = 0 for all other cases. It follows at once that 

A. Any partition having any entry other than unity (or zero) may be 
neglected. 

B. The value of b\p is — 

n {p) 

As an illustration we write the value hi). From the partitions of 

section 24 we select 


36 


36 

111 


110 

111 

and 

110 

110 


101 



Oil 


as being the only partitions making a contribution. The result of section 19 
follows at once. 

34. The Case of a Normal Universe. A normal universe is characterized by
the relationship that $\lambda_r = 0$ when r > 2. It follows that it is only necessary
to compute the coefficients of those partitions giving powers of $\lambda_2$.

Wishart (5) (7) has developed the partition analysis of the k function in
the case of a normal parent while Georgescu has studied the corresponding
m function. It is not the purpose of this section to make extensive study of
the case of the normal parent but simply to indicate that the results of section 24
are immediately applicable. As an illustration we write the values of $\lambda_1(f_2)$,
$\lambda_2(f_2)$, $\lambda_3(f_2)$ and $\lambda_4(f_2)$ in the case of a normal universe. The terms are given
successively, by


      1      2       8       48

      2     11     110     1100
            11     011     0110
                   101     0011
                           1001

and hence

$$\lambda_1(f_2) = b_2 n\,\lambda_2$$

$$\lambda_2(f_2) = 2\left[b_2^2 n + b_{11}^2\, n(n-1)\right]\lambda_2^2$$

$$\lambda_3(f_2) = 8\left[b_2^3 n + 3 b_2 b_{11}^2\, n(n-1) + b_{11}^3\, n(n-1)(n-2)\right]\lambda_2^3$$

$$\lambda_4(f_2) = 48\left[b_2^4 n + 6 b_2^2 b_{11}^2\, n(n-1) + b_{11}^4\, n(n-1) + 4 b_2 b_{11}^3\, n(n-1)(n-2) + 2 b_{11}^4\, n(n-1)(n-2) + b_{11}^4\, n(n-1)(n-2)(n-3)\right]\lambda_2^4.$$

It is only necessary to substitute the b's to obtain the results for different values
of f. This is done in Table IV.
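
The four bracketed factors above can be cross-checked by brute force. The sketch below rests on an assumption not stated in the text: that $f_2$ is the quadratic form $b_2 \sum x_i^2 + b_{11} \sum_{i \ne j} x_i x_j$, so that each bracket should equal the trace of the r-th power of the n by n matrix with $b_2$ on the diagonal and $b_{11}$ elsewhere. All function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def trace_power(b2, b11, n, r):
    """tr(A^r) for the n x n matrix A with b2 on the diagonal and b11 elsewhere."""
    a = [[b2 if i == j else b11 for j in range(n)] for i in range(n)]
    p = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for _ in range(r):
        p = [[sum(p[i][k] * a[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
    return sum(p[i][i] for i in range(n))


def bracket(b2, b11, n, r):
    """Bracketed factor of lambda_r(f_2), r = 1..4, as written in the text."""
    f2 = n * (n - 1)
    f3 = f2 * (n - 2)
    f4 = f3 * (n - 3)
    if r == 1:
        return b2 * n
    if r == 2:
        return b2**2 * n + b11**2 * f2
    if r == 3:
        return b2**3 * n + 3 * b2 * b11**2 * f2 + b11**3 * f3
    if r == 4:
        return (b2**4 * n + 6 * b2**2 * b11**2 * f2 + b11**4 * f2
                + 4 * b2 * b11**3 * f3 + 2 * b11**4 * f3 + b11**4 * f4)
    raise ValueError("only r = 1..4 are given in the text")


def thiele_moment_coefficient(b2, b11, n, r):
    """Coefficient of lambda_2^r in lambda_r(f_2): 2^(r-1) (r-1)! times the bracket."""
    return 2 ** (r - 1) * factorial(r - 1) * bracket(b2, b11, n, r)


# Example: the k function of weight 2, with b_2 = 1/n and b_11 = -1/(n(n-1)).
n = 5
b2, b11 = Fraction(1, n), Fraction(-1, n * (n - 1))
for r in range(1, 5):
    print(r, thiele_moment_coefficient(b2, b11, n, r))
```

With these b's the coefficients reduce to $2^{r-1}(r-1)!/(n-1)^{r-1}$, which is the form quoted from Wishart later in the section; other sample functions only require different values of `b2` and `b11`.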

TABLE IV

The first four Thiele moments of $f_2$ for various sample functions in the case of a
normal universe


Sample 

func¬ 

tion 

Xi(/i) 

Mft) 

*«(/*) 

M/.) 

m 2 

(n - '> X, 
n 

2 (n — 1) x 2 

0 * 2 

n L 

8(ti - 1) j, 

773 X * 

48(ti - 1) Xl 
n 4 

h 

X 2 

2Xl 

n — 1 

8Xl 

(n - l) 2 

48 X 2 

(n - 1)» 

k 

<B - ’> X, 
n 

2(71 - 1 ) 2 

, A 2 

n* 

8(n - 1)X| 
n 8 

48(t7 — l)Xl 
n 4 

m '2 

X 2 

2Xl 

n 

8 Xl 
n 3 

48Xl 

n 4 

h 

X 2 

2Xl 

n -r 1 

8 Xl 

(n - 1 ) 2 

48Xl 

(71 - 1)* 

h' 

n 

2Xl 

8(71 - 2 )x! 

48(71* - 3n + 3)x! 

n >2 

u 

k..... 

n(n — 1 ) 

n 2 (n — l ) 2 

n z (n — l) 8 


One surmises that the general value of $\lambda_r(f_2)$ is $2^{r-1}(r-1)!\,\lambda_2^r B$, with the
partition

    11000 ... 0
    01100 ... 0
    00110 ... 0
    . . . . . .
    00000 ... 11
    10000 ... 01

where B represents the b coefficient of the r rowed partition. This induction
appears consistent with the fact that

$$\lambda_{r+1}(k_2) = \frac{2^r\, r!\, \lambda_2^{r+1}}{(n-1)^r}$$

as shown by John Wishart (7). The whole subject of the Thiele moments of 
the general function in the case of a normal universe would make an interesting 
subject of investigation. 


35. Summary and Conclusion. The contributions of this paper include 

1. The definitions of specific moment functions in terms of power sums. 

2. The use of indeterminate multipliers in representing a general isobaric 
moment function. 

3. The finding of the expected value of products of these functions by alge¬ 
braic methods. 

4. The use of tables in writing these expected values in terms of moments 
(or of moments about a fixed point) of the universe. 

5. The finding of the expected values of specific moment functions by sub¬ 
stitution. 

6. Means of establishing the expansion of new moment functions which are 
defined by their expected values. 

7. The introduction of the sample function of weight r whose expected
value is $\mu_r$.

8. The introduction of the sample function of weight r whose expected
value is $\mu'_r$.

9. The two way partition formulae of weight ≤ 8 which do not involve
unit parts.

The use of these partition formulae in writing: 

10. The moments about a fixed point of $f_r$ in terms of moments.

11. The moments of $f_r$ in terms of moments.

12. The Thiele moments of $f_r$ in terms of moments.

13. The moments about a fixed point of $f_r$ in terms of Thiele moments.

14. The moments of $f_r$ in terms of Thiele moments.

15. The Thiele moments of $f_r$ in terms of Thiele moments.

16. Special rules in the case of Thiele moments. 

17. The applicability of these results to a given sample moment function 
and hence the derivation of varied results, of such authors as Thiele, Tchouproff, 
Church, Fisher, Craig, and Georgescu, from the same partition formulae. 

18. The simplicity of the formulae when h' r is used as the sample function. 

19. The application of the synthetic formulae to the Craig method. 

20. The applicability of the theory to a normal universe. 

The introduction of such general procedure opens up a wide field for future 
study. It is impossible in a single paper dealing with so broad a subject to do 
more than to outline the general scheme by which two way partitions can be 
used as a central formulization of the various formulae for moments of moments.
More detailed proofs and more extensive analysis of the more important of the 
special cases will undoubtedly be supplied by later writers. 

In later papers the author will show how the partition representation can 
be used in the case of multivariate distributions and how it can also be used, 
in connection with the sampling polynomials introduced by H. C. Carver (11), 
to represent the more complex formulae obtained in the case of finite sampling. 

It is obvious that the author is indebted to the classical moment studies of 
Fisher and Craig. He also wishes to acknowledge his indebtedness to Prof. 
Craig and to Prof. Carver who have read the manuscript and have made 
valuable suggestions. 

The University of Michigan. 
NOTES


A COEFFICIENT OF CORRELATION BETWEEN SCHOLARSHIP 

AND SALARIES 

INTRODUCTION 

Some might doubt that it is correct to apply a coefficient of correlation to 
show the relationship between scholarship and salaries. This coefficient can 
be trusted to give at least a rough approximation, which is all that is necessary 
in the inexact science of vocation. It is fictitious accuracy to be too finical 
in the application of formulas. Therefore, a coefficient of correlation between 
scholarship and salaries is a valuable part of human knowledge. 

Would it be worth while to find this coefficient if it is based upon the experience
of the American Telephone and Telegraph Company? Since the employment
practices of this company are not representative of the employment
practices of business at large, one might doubt the validity of drawing general
conclusions from such specialized data. The coefficient for business at large
is probably less than the coefficient for the Bell System; the value of this knowledge
is enhanced if we know the latter coefficient. Since this company is very
large, a coefficient between scholarship and salaries would be valuable, even if
this coefficient applies only to the Bell System and to other companies having
approximately the same employment practices.

An article 1 by Mr. Walter S. Gifford, President of the Bell System, contains 
a discussion of some of the relationships between scholarship and salaries. 
President Gifford, however, did not determine in the case of the Bell System a 
coefficient of correlation between scholarship and salaries. 

The purpose of this article is not a new contribution to statistical method, 
but is an application of the method 2 of finding the coefficient of correlation 
when the two variables have not been quantitatively measured. This method 
will be applied to the chart on page 672 of President Gifford's article, in order 
to determine for the Bell System the coefficient of correlation between scholar¬ 
ship and salaries. 


FINDING THE COEFFICIENT OF CORRELATION 

An explanation of the chart. It is based on the experience of 2,144 Bell 
System employees over five years out of college. First, assume these employees 

1 It is entitled “Does Business Want Scholars?” and was printed in the May 1928 issue 
of Harper’s Magazine. 

2 It can be found in Elderton’s “Frequency Curves and Correlation.”
are grouped according to their grades in college. In the high scholarship group
put those who graduated in the highest third of their classes. The middle 
and low scholarship groups are formed in like manner. Secondly, suppose the 
same employees are divided into three equal groups according to their salaries. 
Then, the salary of any one of the employees would be high, middle, or low. 

Assume a hypothetical group of 300 employees who are college graduates. 
Suppose that the scholarship of 100 of them was high, that the scholarship of 
100 of them was middle, and that the scholarship of the others was low. Also 
assume that the salary experience of these 300 employees is the same as that 
of the 2,144 employees of the Bell System. 

The 300 employees can be grouped according to the following table. 


TABLE NO. 1

                      Scholarship
    Salary      Low    Middle    High    Totals
    High         22      24       48       94
    Middle       31      39       27       97
    Low          47      37       25      109
    Totals      100     100      100      300


This table can be combined as follows.


TABLE NO. 2

                            Scholarship
    Salary            Low & Middle    High
    High                   c            d
    Middle & Low           a            b


Then, c = 46, a = 154, d = 48, and b = 52. Assume N = 300.

Assume x is a function of grades received in college. Suppose y is a function 
of salaries received. Assume that the frequencies x and y both follow the 
normal curve of error whose standard deviation is equal to one. Also assume 
that the average of x and the average of y are both equal to zero. It is a 
matter of common knowledge that salaries are not arranged in a symmetrical 
fashion; y is not a linear function of salaries. 

In the formulas which follow, r is the symbol for the coefficient of correlation. 
These formulas are applied to Table No. 2. We have 

$$\frac{1}{\sqrt{2\pi}} \int_0^h e^{-\frac{1}{2}x^2}\,dx = \frac{(a + c) - (b + d)}{2N} = .167, \text{ and } h = .4316.$$

Also

$$\frac{1}{\sqrt{2\pi}} \int_0^k e^{-\frac{1}{2}y^2}\,dy = \frac{(a + b) - (c + d)}{2N} = .187, \text{ and } k = .4874.$$

Then,

$$H = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}h^2} = .3635, \text{ and } K = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}k^2} = .3543.$$

All the quantities except r in the following approximate equation are known:

$$\frac{ad - bc}{N^2 HK} = r + \frac{r^2}{2}\,hk + \frac{r^3}{6}(h^2 - 1)(k^2 - 1) + \frac{r^4}{24}\,h(h^2 - 3)\,k(k^2 - 3) + \frac{r^5}{120}(h^4 - 6h^2 + 3)(k^4 - 6k^2 + 3).$$

Therefore,

$$.0261\,r^5 + .0681\,r^4 + .1034\,r^3 + .1052\,r^2 + r - .4314 = 0.$$

Then, r is approximately equal to .4051. Consequently, for practical purposes
we can assume that r = .4.
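
The arithmetic of this section can be retraced in a few lines. The sketch below is an illustration under stated assumptions rather than the procedure actually used in the note: it takes the limits h and k from the inverse normal distribution in Python's standard library instead of printed tables, keeps the series to the fifth power of r, and solves for r by bisection, which assumes a positive association and a series that increases on the interval from 0 to 1. The function name is hypothetical.

```python
from statistics import NormalDist


def tetrachoric_r(a, b, c, d):
    """Return (h, k, r) for the fourfold table with cells a, b, c, d as in Table No. 2."""
    n = a + b + c + d
    std = NormalDist()  # unit normal curve, mean 0 and standard deviation 1
    # Limits h and k: the area from 0 to h (or k) equals the marginal excess over one half.
    h = std.inv_cdf(0.5 + ((a + c) - (b + d)) / (2 * n))
    k = std.inv_cdf(0.5 + ((a + b) - (c + d)) / (2 * n))
    big_h, big_k = std.pdf(h), std.pdf(k)
    target = (a * d - b * c) / (n * n * big_h * big_k)
    # Coefficients of r, r^2, ..., r^5 in the approximate equation.
    coeffs = [
        1.0,
        h * k / 2,
        (h * h - 1) * (k * k - 1) / 6,
        h * (h * h - 3) * k * (k * k - 3) / 24,
        (h**4 - 6 * h * h + 3) * (k**4 - 6 * k * k + 3) / 120,
    ]

    def series(r):
        return sum(coef * r ** (j + 1) for j, coef in enumerate(coeffs))

    lo, hi = 0.0, 1.0  # assumption: the root lies between 0 and 1
    for _ in range(60):
        mid = (lo + hi) / 2
        if series(mid) < target:
            lo = mid
        else:
            hi = mid
    return h, k, (lo + hi) / 2


h, k, r = tetrachoric_r(a=154, b=52, c=46, d=48)
print(round(h, 4), round(k, 4), round(r, 4))
```

Because the limits come from a library routine rather than interpolation in a table, h and k differ from the printed values in about the third decimal place, and r comes out close to .405, so the practical conclusion r = .4 is unchanged.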

28 Boody Street John L. Roberts

Brunswick, Maine


NOTE ON THE DERIVATION OF THE MULTIPLE CORRELATION 

COEFFICIENT 


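Consider N observed values of each of n variables. These n·N values may
be tabulated in a double-entry table as follows:

$$\begin{matrix}
X_{11} & X_{12} & X_{13} & \cdots & X_{1N} \\
X_{21} & X_{22} & X_{23} & \cdots & X_{2N} \\
\cdots & \cdots & \cdots & \cdots & \cdots \\
X_{n1} & X_{n2} & X_{n3} & \cdots & X_{nN}
\end{matrix}$$

where $X_{ik}$ is the k-th value of the i-th variable.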

Using the i-th variable as the dependent variable, the general linear relationship
between the n variables may be expressed by

$$x_i = {}_ia_1 x_1 + {}_ia_2 x_2 + \cdots + {}_ia_{i-1} x_{i-1} + {}_ia_{i+1} x_{i+1} + \cdots + {}_ia_n x_n \qquad (1)$$

where

${}_ia_j$ is the general parameter which is to be determined empirically;
$x_j = X_j - M_j$;
$M_j$ is the arithmetic mean of the j-th variable.

By the method of least squares, the constants of (1) must satisfy the normal
equations:

$$(\Sigma x_1^2)\,{}_ia_1 + (\Sigma x_1 x_2)\,{}_ia_2 + \cdots + (\Sigma x_1 x_{i-1})\,{}_ia_{i-1} + (\Sigma x_1 x_{i+1})\,{}_ia_{i+1} + \cdots + (\Sigma x_1 x_n)\,{}_ia_n = \Sigma x_1 x_i$$

$$(\Sigma x_2 x_1)\,{}_ia_1 + (\Sigma x_2^2)\,{}_ia_2 + \cdots + (\Sigma x_2 x_{i-1})\,{}_ia_{i-1} + (\Sigma x_2 x_{i+1})\,{}_ia_{i+1} + \cdots + (\Sigma x_2 x_n)\,{}_ia_n = \Sigma x_2 x_i$$

$$\cdots$$

$$(\Sigma x_{i-1} x_1)\,{}_ia_1 + (\Sigma x_{i-1} x_2)\,{}_ia_2 + \cdots + (\Sigma x_{i-1} x_n)\,{}_ia_n = \Sigma x_{i-1} x_i$$

$$(\Sigma x_{i+1} x_1)\,{}_ia_1 + (\Sigma x_{i+1} x_2)\,{}_ia_2 + \cdots + (\Sigma x_{i+1} x_n)\,{}_ia_n = \Sigma x_{i+1} x_i$$

$$\cdots$$

$$(\Sigma x_n x_1)\,{}_ia_1 + (\Sigma x_n x_2)\,{}_ia_2 + \cdots + (\Sigma x_n^2)\,{}_ia_n = \Sigma x_n x_i$$

where

$$(\Sigma x_i x_j) = \sum_{k=1}^{N} (X_{ik} - M_i)(X_{jk} - M_j).$$

But

$$(\Sigma x_i x_j) = N r_{ij}\sigma_i\sigma_j, \qquad (\Sigma x_i^2) = N\sigma_i^2 = N r_{ii}\sigma_i\sigma_i \qquad (2)$$

where

$r_{ij}$ is the Pearsonian coefficient of correlation between the i-th and j-th variables,
$\sigma_i$, the standard deviation of the i-th variable.

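Substituting the right members of (2) in the normal equations, we obtain
the system:

$$\sum_{k=1}^{n} r_{1k}\,\sigma_1\sigma_k\;{}_ia_k = 0$$

$$\sum_{k=1}^{n} r_{2k}\,\sigma_2\sigma_k\;{}_ia_k = 0$$

$$\cdots$$

$$\sum_{k=1}^{n} r_{i-1,k}\,\sigma_{i-1}\sigma_k\;{}_ia_k = 0$$

$$\sum_{k=1}^{n} r_{i+1,k}\,\sigma_{i+1}\sigma_k\;{}_ia_k = 0$$

$$\cdots$$

$$\sum_{k=1}^{n} r_{nk}\,\sigma_n\sigma_k\;{}_ia_k = 0 \qquad (3)$$

where

$${}_ia_i = -1.$$

Let

$$\Delta = \begin{vmatrix}
r_{11}\sigma_1\sigma_1 & \cdots & r_{n1}\sigma_n\sigma_1 \\
\cdots & \cdots & \cdots \\
r_{1n}\sigma_1\sigma_n & \cdots & r_{nn}\sigma_n\sigma_n
\end{vmatrix} \qquad (4)$$

$\Delta_{ij}$ be the first minor of the element $r_{ij}\sigma_i\sigma_j$ in $\Delta$, ${}_{ik}\Delta$ be $\Delta$ with the i-th and
k-th columns interchanged, and ${}_{ik}\Delta_{ii}$ be the first minor of the element in the i-th
column and i-th row of ${}_{ik}\Delta$.

Solving (3) for ${}_ia_k$ by Cramer’s rule, we find

$${}_ia_k = \frac{{}_{ik}\Delta_{ii}}{\Delta_{ii}}.$$

But it can easily be proved that

$${}_{ik}\Delta_{ii} = (-1)^{i-k+1}\,\Delta_{ik};$$

hence

$${}_ia_k = (-1)^{i-k+1}\,\frac{\Delta_{ik}}{\Delta_{ii}}.$$

Using cofactors of $\Delta$ instead of minors, we have

$${}_ia_k = (-1)^{i-k+1}\,\frac{(-1)^{i+k} D_{ik}}{D_{ii}} = -\frac{D_{ik}}{D_{ii}}.$$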


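Without writing the determinant out in full, we notice that the $\sigma$'s can be
factored out. Hence

$${}_ia_k = -\frac{\sigma_1^2\sigma_2^2 \cdots \sigma_{k-1}^2\,\sigma_k\,\sigma_{k+1}^2 \cdots \sigma_{i-1}^2\,\sigma_i\,\sigma_{i+1}^2 \cdots \sigma_n^2\;K_{ik}}{\sigma_1^2\sigma_2^2 \cdots \sigma_{i-1}^2\,\sigma_{i+1}^2 \cdots \sigma_n^2\;K_{ii}} = -\frac{\sigma_i K_{ik}}{\sigma_k K_{ii}}$$

where

$$K = \begin{vmatrix}
r_{11} & \cdots & r_{1n} \\
\cdots & \cdots & \cdots \\
r_{n1} & \cdots & r_{nn}
\end{vmatrix} \qquad (5)$$

and $K_{ik}$ is the cofactor of $r_{ik}$ in K.

Using these derived values for the coefficients, we may write (1) in the symmetric
form:

$$\frac{K_{i1}}{\sigma_1}(X_1 - M_1) + \frac{K_{i2}}{\sigma_2}(X_2 - M_2) + \cdots + \frac{K_{in}}{\sigma_n}(X_n - M_n) = 0,$$

or

$$\sum_{j=1}^{n} \frac{K_{ij}\,x_j}{\sigma_j} = 0.$$
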


For a multiple correlation coefficient, we use the formula

$$R_i^2 = 1 - \frac{\sum_{j=1}^{N}\left[x_{ij} - \left(\sum_{k=1}^{i-1} {}_ia_k x_{kj} + \sum_{k=i+1}^{n} {}_ia_k x_{kj}\right)\right]^2}{N\sigma_i^2} \qquad (6)$$

which measures the amount of observed dispersion from the regression plane
in which $X_i$ is the dependent variable.

Substituting the values for the a’s, we find

$$R_i^2 = 1 - \frac{1}{K_{ii}^2\,N} \sum_{j=1}^{N}\left[\frac{K_{i1}x_{1j}}{\sigma_1} + \frac{K_{i2}x_{2j}}{\sigma_2} + \cdots + \frac{K_{in}x_{nj}}{\sigma_n}\right]^2.$$

Squaring the bracket expression and using (2), we obtain

$$R_i^2 = 1 - \frac{1}{K_{ii}^2} \sum_{k=1}^{n}\sum_{l=1}^{n} K_{ik}K_{il}\,\frac{\sum_{j=1}^{N} x_{kj}x_{lj}}{N\sigma_k\sigma_l} = 1 - \frac{1}{K_{ii}^2} \sum_{k=1}^{n} K_{ik}\left[\sum_{l=1}^{n} r_{kl}K_{il}\right].$$

The second sum is the sum of the products of the elements in the k-th row
by the cofactors of the elements in the i-th row. This sum is necessarily zero
unless k = i; but if k = i, this sum is equal to K. Hence

$$R_i^2 = 1 - \frac{1}{K_{ii}^2}(K_{ii}K) = 1 - \frac{K}{K_{ii}}.$$

Oregon State Agriculture College William J. Kirkham

School of Science
Corvallis, Oregon
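
The closing identity of the preceding note, $R_i^2 = 1 - K/K_{ii}$, is easy to try on a small case. The sketch below is only an illustration: the three correlations are hypothetical values chosen for the example, the function names are not from the note, and the determinant is computed by naive cofactor expansion, which is adequate only for small n. For three variables the result is compared with the familiar closed form for the squared multiple correlation of the first variable on the other two.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def multiple_r_squared(corr, i):
    """R_i^2 = 1 - K / K_ii, with K = det(corr) and K_ii the cofactor of r_ii."""
    minor = [row[:i] + row[i + 1:] for k, row in enumerate(corr) if k != i]
    return 1 - det(corr) / det(minor)


# Hypothetical correlations r12 = .6, r13 = .5, r23 = .3 (illustrative only).
r12, r13, r23 = 0.6, 0.5, 0.3
corr = [
    [1.0, r12, r13],
    [r12, 1.0, r23],
    [r13, r23, 1.0],
]
r2 = multiple_r_squared(corr, 0)
closed_form = (r12**2 + r13**2 - 2 * r12 * r13 * r23) / (1 - r23**2)
print(r2, closed_form)
```

The cofactor $K_{ii}$ carries the sign $(-1)^{2i} = +1$, so it coincides with the minor obtained by deleting row i and column i, which is what `multiple_r_squared` uses.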
NOTE ON NUMERICAL EVALUATION OF DOUBLE SERIES 1


1. The Euler-Maclaurin summation formula has been extended to two
variables by Dr. Sheppard, 2 and Mr. Irwin, 3 to determine cubature formulas.
A more complicated two-dimensional form was given by Baten 4 involving
product polynomials, for which a remainder term was also calculated. The
purpose of this note is to apply the simpler formula to the numerical evaluation
of double series of positive terms. The method may be extended to multiple
series of order p > 2. If the double series converges one may sum by rows
(or columns), using the ordinary sum formula twice. The method is to take
out a rectangular block of mn terms and then apply the formula to the remaining
terms. By taking m and n sufficiently large one may cause the series resulting
from the formula to converge sufficiently rapidly to obtain the sum to the
desired number of decimal places. For practical work the error may be estimated
because of the asymptotic character of the series involved in the
Euler-Maclaurin formula.

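Write this in the form

(1) $$\sum_{x=a}^{s-1} f(x) = \int_a^s f(x)\,dx + \tfrac{1}{2}f(a) - \tfrac{1}{2}f(s) - \frac{f'(a) - f'(s)}{12} + \frac{f'''(a) - f'''(s)}{720} - \frac{f^{v}(a) - f^{v}(s)}{30240} + \frac{f^{vii}(a) - f^{vii}(s)}{1209600} - \cdots + (-1)^r B_r\,\frac{f^{(2r-1)}(a) - f^{(2r-1)}(s)}{(2r)!} + \cdots$$

If s → ∞ one has accordingly in the ordinary case of convergence

(2) $$\sum_{x=a}^{\infty} f(x) = \int_a^{\infty} f(x)\,dx + \tfrac{1}{2}f(a) - \frac{f'(a)}{12} + \frac{f'''(a)}{720} - \frac{f^{v}(a)}{30240} + \cdots$$

Now define

$$v(x) = \sum_{y=b}^{\infty} u(x, y) = \int_b^{\infty} u(x, y)\,dy + \tfrac{1}{2}u(x, b) - \frac{u_y(x, b)}{12} + \frac{u_{yyy}(x, b)}{720} - \cdots$$

and

$$w(y) = \sum_{x=a}^{\infty} u(x, y) = \int_a^{\infty} u(x, y)\,dx + \tfrac{1}{2}u(a, y) - \frac{u_x(a, y)}{12} + \frac{u_{xxx}(a, y)}{720} - \cdots,$$

then

$$\sum_{x=1}^{\infty}\sum_{y=1}^{\infty} u(x, y) = \sum_{x=1}^{a-1}\sum_{y=1}^{b-1} u(x, y) + \sum_{x=1}^{\infty} v(x) + \sum_{y=1}^{b-1} w(y)$$

(3) $$= \int_1^{\infty} v(x)\,dx + \frac{v(1)}{2} - \frac{v'(1)}{12} + \frac{v'''(1)}{720} - \frac{v^{v}(1)}{30240} + \cdots + \sum_{x=1}^{a-1}\sum_{y=1}^{b-1} u(x, y) + \int_1^{b} w(y)\,dy - \tfrac{1}{2}w(b) + \tfrac{1}{2}w(1) + \frac{w'(b) - w'(1)}{12} - \frac{w'''(b) - w'''(1)}{720} + \cdots$$
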

1 Presented to the Society, Nov. 30, 1934.

2 W. F. Sheppard, “Some Quadrature Formulae,” Proc. London Math. Soc., Vol. xxxii,
1900.

3 J. O. Irwin, “Tracts for Computers,” No. X, Cambridge Univ. Press, 1923, On Quadrature
and Cubature.

4 W. D. Baten, “A Remainder for the Euler-Maclaurin summation formula in two
independent variables,” Amer. Journal of Math., Vol. 64, 1932, pp. 265-276.
Instead of this one may use $\sum_{x=1}^{a-1}\sum_{y=1}^{b-1} u(x, y) + \sum_{y=1}^{\infty} w(y) + \sum_{x=1}^{a-1} v(x)$. The scheme
of the double series may be illustrated by a sketch of a quadrant of the xy-plane
in which the point (x, y) represents the term u(x, y).

Evidently by taking a combination of results from (3) one may evaluate
quite readily such finite sums as $\sum_{x=p}^{q}\sum_{y=r}^{t} u(x, y)$ where q and t are large.

As an illustration of (3) consider X) L (** + Here one needs to 

evaluate the integral of the summand. The transformations x = ay tan θ
and y = 1/t lead to a form which may be integrated by parts. The more
complicated form $\sum\sum (ax^2 + 2bxy + cy^2)^{-s}$ for the case in which s > 3/2,
a > 1, might be handled by using x = 1/t and approximate integration by
Simpson’s rule.

Take as a second example $\sum_{x=1}^{\infty}\sum_{y=1}^{\infty} (x + y)^{-p}$, p > 2. The case of p = 4 was
carried out by taking a = b = 10 in (3) and carrying the computation to twelve
decimals. The series involved converge rapidly and a result was obtained
which differed by 2 in the 12th place from the true value 0.119 733 669 448+.

By summing diagonally one may convert this to the simple series $\sum_{z=1}^{\infty} z(z + 1)^{-4}$
or $\sum_{s=2}^{\infty} (s - 1)s^{-4} = \sum_{s=1}^{\infty} (s^{-3} - s^{-4})$. The method of summation diagonally may
be extended to $\sum\sum (x + ay)^{-p}$, p > 2, a > 0, by the applications of the
Euler-Maclaurin sum formulas (1), (2) in succession after a triangular array
of terms have been omitted.

The form $\sum\sum x^{-p}y^{-q}$ can be written as the product of the single series
$(\sum x^{-p})(\sum y^{-q})$.
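
The second example can be reproduced to roughly ten decimals with very little work. The sketch below is an illustration under stated assumptions, not the twelve-decimal computation described above: it uses the diagonal form $\sum_{s \ge 2} (s-1)s^{-p}$, sums the terms below a cut-off directly, and applies formula (2) to the remainder, stopping at the fifth-derivative term. The function names and the cut-off of 10 are choices made for the example.

```python
def power_derivative(q, m, s):
    """m-th derivative of s**(-q) with respect to s, evaluated at s."""
    coeff = 1
    for i in range(m):
        coeff *= -(q + i)
    return coeff * s ** (-q - m)


def g_derivative(p, m, s):
    """m-th derivative of g(s) = (s - 1) s^(-p) = s^(1-p) - s^(-p)."""
    return power_derivative(p - 1, m, s) - power_derivative(p, m, s)


def double_series(p=4, cut=10):
    """Approximate the sum over x, y >= 1 of (x + y)^(-p), for p > 2."""
    # There are s - 1 terms on the diagonal x + y = s, each equal to s^(-p).
    head = sum((s - 1) * s ** (-p) for s in range(2, cut))
    # Integral of g from the cut-off to infinity.
    integral = cut ** (2 - p) / (p - 2) - cut ** (1 - p) / (p - 1)
    # Formula (2) applied to g with a = cut, truncated after the 1/30240 term.
    tail = (integral
            + g_derivative(p, 0, cut) / 2
            - g_derivative(p, 1, cut) / 12
            + g_derivative(p, 3, cut) / 720
            - g_derivative(p, 5, cut) / 30240)
    return head + tail


value = double_series(p=4, cut=10)
print(value)
```

The first omitted term of the asymptotic series is of the order of $10^{-11}$ at this cut-off, which is the sense in which the error "may be estimated" from the series itself; a larger cut-off or one more term would tighten the agreement with the true value quoted in the text.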

2. Another method of numerical evaluation is the analog of that used for 
single series by the author. 5 Instead of rectangles one has right prisms of 
square or rectangular cross-section. Instead of shifting the rectangles one unit 
to the right to determine upper and lower bounds the prisms are shifted diago¬ 
nally so that they go effectively one unit in each variable. In the case of a square 
base each prism is moved along the 45° line one diagonal unit length. For 
the lower bound instead of trapezoids one uses truncated prisms. For example, 
the prism of height $u_{mn}$ is cut by two planes, one determined by the upper
vertices $u_{m,n}$, $u_{m+1,n}$, $u_{m,n+1}$ and the other by the upper vertices $u_{m+1,n}$, $u_{m,n+1}$,

$u_{m+1,n+1}$ of the truncated prism. The surface z = u(m, n) passes through
all the upper corners of the truncated prisms. Each prism is composed of 
two truncated triangular prisms. Now the volume of such a triangular prism is 
the arithmetic mean of its vertical edges multiplied by the area of its base. 


5 “A New Method for Finding the Numerical Sum of an Infinite Series,” Amer. Math.
Monthly, vol. XL, No. 9, Nov., 1933, pp. 537-542.
Hence the difference in volume between the truncated rectangular prism mentioned
above and the prism of uniform height $z = u_{mn}$ can be shown to be

(4) $$(5u_{m,n} - u_{m+1,n+1} - 2u_{m+1,n} - 2u_{m,n+1})/6.$$
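
Expression (4) follows from the rule just stated, that a truncated triangular prism has volume equal to the mean of its three vertical edges times the area of its base. The sketch below checks this with exact fractions; it assumes a unit square base split along the diagonal joining the points (m+1, n) and (m, n+1), and the function names are hypothetical.

```python
from fractions import Fraction


def truncated_prism_volume(u00, u10, u01, u11):
    """Volume over a unit square cut into two triangles of area 1/2 each.

    u00, u10, u01, u11 stand for u(m,n), u(m+1,n), u(m,n+1), u(m+1,n+1).
    """
    half = Fraction(1, 2)
    lower = half * Fraction(u00 + u10 + u01, 3)  # triangle containing (m, n)
    upper = half * Fraction(u10 + u01 + u11, 3)  # triangle containing (m+1, n+1)
    return lower + upper


def difference_formula(u00, u10, u01, u11):
    """Expression (4): uniform prism of height u(m,n) minus the truncated prism."""
    return Fraction(5 * u00 - u11 - 2 * u10 - 2 * u01, 6)


def u(m, n):
    """Illustrative term u(m, n) = (m + n)^(-4), as in the example of the note."""
    return Fraction(1, (m + n) ** 4)


m, n = 10, 10
edges = (u(m, n), u(m + 1, n), u(m, n + 1), u(m + 1, n + 1))
gap = edges[0] - truncated_prism_volume(*edges)
print(gap == difference_formula(*edges))
```

The two edges on the shared diagonal are counted in both triangles, which is why they carry the coefficient 2 in (4) while the far corner carries the coefficient 1.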

Let us consider series whose corresponding surfaces do not rise above these 
truncated prisms. This sort of truncated prism differs less from the volume 
under the surface than the one formed by the diagonal joining the other pair 
of upper vertices and planes through it for upper faces. The lower bound for 
the remainder is the volume under the surface extending to infinity in the 
m and n directions plus the sum of these differences. Accordingly one deter¬ 
mines as the lower bound for the remainder R m ~i, n-i after summing a rec- 

m—1 n—1 

tangular array 2 X the form 

t-i i-i 

<*> m 

(2/Um, 1 + 2Wl t n 5t/n»,n)/6 -j- ^ ^ Ilm+i, 1 "4" ^ ^ Wl,n+j *4" ^ ^ > U\ t n 

(5) ^ 7W 

V 7 n r*> roo roo r m 

4" i "4” / / Um,n dlTldfl -|- / / Um,n dfudfl ^ Rm-~ l,n—1* 

/—I ,/n yi 

The upper bound may likewise be given as follows: 

/•oo roo r 00 00 ~| 

(6) Rm-i.n -1 < S + T + I / u m ,ndmdn — A; £ 

Jn—l Jm —1 L;—»n—1 t—m J 


where 


00 n—1 

w.7, 

i—m —1 




(8) ft = 


roo roo 

/ / 

y n ~i >-i 


roo roo oo oo 

I I llm,n dlfldfl ~~ ^ 

J» ,/m j—» — 1 »—m 


1,n—1 *4” ^w.n—1 


An alternate definition of k is 


C — ^ Um ,n dlTtfitl 


An illustration is afforded by (w + 1) 4 for which & = .45614, 

n—1 m— 1 

ft' = .44536 when m = n = 10 in (8), (9). In this case (5) gave an error of 
— 14 X 10~* and (6) an error of 10 -8 . 

S and T may be evaluated by the method published in the Monthly. 6
One must assume that k increases with m and n. It is evident that for this


6 Loc. cit.
method and for the one in the Monthly differentiability is not required but
only integrability, conditions less restrictive than those required by the
Euler-Maclaurin summation formulas. It is also clear that the method may be
extended to multiple series of positive terms of multiplicity greater than two.

Department of Mathematics Chester C. Camp 

University of Nebraska 
Lincoln, Nebraska 



REPORT OF THE ANNUAL MEETING OF THE INSTITUTE OF 
MATHEMATICAL STATISTICS 

The meeting of the Institute of Mathematical Statistics for 1936 was held in 
Chicago on December 28-30 in connection with the meetings of the American 
Statistical Association and the Econometric Society. 

In addition to the sessions at which voluntary papers were read, a session with 
invited papers was held on the morning of December 30. At the invitation of 
the Program Committee, Professor P. R. Rider presented a paper on “Recent 
Advances in Mathematical Statistics: Factorial Design” and Professor Harold 
Hotelling spoke on “The Analysis of Sets of Correlated Variates.” 

Professor C. C. Craig of the University of Michigan and Professor A. R.
Crathorne of the University of Illinois constituted the Program Committee.

At the business meeting of the Institute, the following officers were elected 
for the year 1937: President, Dr. W. A. Shewhart; Vice-Presidents, Professors 
P. R. Rider and B. H. Camp; Secretary-Treasurer, Professor A. T. Craig. 

The Institute voted that it would presumably hold its 1937 meeting with the 
American Mathematical Society. 

Allen T. Craig, 
Secretary . 


NOTICE TO SUBSCRIBERS 

Plans are under way to include in the Annals a new section, entitled
“Numerical Illustrations of Statistical Methodology.” This new section will be a
regular feature of the Annals, and will deal with the application of statistical
technique and theory to the solution of problems in various fields. It is hoped 
that this new section will be of considerable value to those who are primarily 
interested in numerical applications of the more recent theoretical developments 
in mathematical statistics. 

The Editor will welcome contributions to this new section of the Annals. 
REGRESSION AND CORRELATION EVALUATED BY
A METHOD OF PARTIAL SUMS

By Felix Bernstein

“To be sure, Laplace viewed the matter in a similar way but he selected the 
absolute value of the error as a measure of loss. But if we mistake not, this 
position is certainly not less arbitrary than our own; that is to say, whether the 
double error is to be considered just as tolerable as, or worse than, the simple 
error twice repeated and whether it is thus more fitting to ascribe to the double 
error only a double weight, or a greater one, is a question which is neither in 
itself clear nor determinable by mathematical proof but has to be left entirely 
to individual discretion. 

“Furthermore, it cannot be denied that the assumption under discussion 
violates the principle of continuity and precisely for this reason the procedure 
based on it strongly defies analytic treatment while the results to which our 
principle leads have the advantage of simplicity as well as of generality.”— 

C. F. Gauss: Theoria combinationis observationum, pars prior, art. 6.

Since the “Theoria Combinationis” of C. F. Gauss appeared in the year 1821 
a century of Mathematical Statistics has been dominated by the ideas of this 
classical treatise—ideas whose fertility does not seem to be exhausted even 
today. 

The germs of most modern contributions to mathematical statistics, including
those of Karl Pearson and his school, go back decidedly to this paper.
Though the immediate achievements of Gauss are so conspicuous as not to
need any comment, a true critical appreciation of the work can be gained only
by comparing it with the previous methods of Laplace, superseded by those of
Gauss.

For such critical appreciation, C. F. Gauss himself has prepared the ground 
in the lines quoted at the beginning of this article. To Gauss the standard 
deviation is a measure of uncertainty or risk of a game in which the errors of 
observation are considered as causing only losses. In this he follows the lead 
of his great predecessor. The difference between them is that Gauss adopts 
the square of the error as a measure of the loss while Laplace adopts its absolute 
value for this purpose. Either choice frees the error from its sign so that the 
loss is the same regardless of the sign of the error. 

Gauss considers this choice of the measure of the loss as purely conventional.
Therefore he feels justified in adopting the square of the error, because in
adopting the square instead of the absolute value of the error, the mathematics
he uses remains in the easily accessible domain of analytical processes. This
creates for these methods a superiority in elegance, simplicity, and generality.

The modern developments of mathematical statistics, based on the principles
of Gauss, have confirmed the correctness of this viewpoint. This has proved
true particularly in the theory of analysis of variance developed by R. A. Fisher
and in the more general theory of semi-invariants, first defined by T. N. Thiele.

An inadequacy of the Gaussian method, one that seriously impairs its value for
statistical use, has come to light through the investigations of Karl Pearson on
distributions of one and two variables. Since the moments of higher order
involve standard deviations of increasing magnitude, the characterization of the
distributions by means of the moments, in line with the Gauss-Thiele concepts,
becomes practically impossible. Therefore it was of the greatest interest that
Lindeberg was able to derive an expression for the standard deviation of a
measure of skewness constructed not on Gaussian but on Laplacian lines,
namely based exclusively upon the sign of the error. The mathematical
difficulties surmounted by Lindeberg by a very involved and difficult analysis
(with some clearly indicated gaps in the proofs) are precisely of the character
of those that Gauss wished to avoid. Encouraged by the success of Lindeberg,
I have developed in two papers¹ the standard deviations of more general
moments and the correlations between them, of which the mean deviation of
Laplace and Lindeberg’s measure of skewness are special cases. The proofs have
been arrived at by a rather simple and rigorous procedure. These new moments,
together with the old ones, form a new system of statistical characteristics by
which a distribution in one or two variables can be described by expressions
of lower order and therefore of greater precision. This method makes
unnecessary the use of moments of higher order than the third.

But another point of interest is still involved. It has been assumed that the
Gaussian characteristics give a greater amount of information than those of
Laplace. This is proved, however, only for the case of the normal distribution
$e^{-h^2x^2}$. This was recognized by Gauss himself in his paper of April, 1816,
which appeared five years earlier than the Theoria Combinationis Observationum.
In article 6 of that paper he says that the constant h of a normal distribution
obtained from one hundred observations by the use of the standard error is
as exact as that obtained from one hundred fourteen observations in which
the mean deviation is used. Hence with a given number of observations only
the equivalent of 88% of the total is used by the second method. This does
not hold true for all distributions. The following theorem can easily be proved:
The amount of information, as defined above, furnished by the use of the mean
deviation is greater than, equal to, or less than that furnished by the standard
deviation, depending respectively upon whether

$$\beta_2 - 1 \;\gtreqless\; 4(\beta_0 - 1),$$

where $\beta_2 = \mu_4/\mu_2^2$, $\beta_0 = \mu_2/\theta^2$, $\mu_k$ is the $k$-th
moment and $\theta$ the mean deviation.

For example, in the distribution $\frac{h}{2}\,e^{-h|x|}$, the mean deviation
furnishes a greater amount of information than the standard deviation.²

¹ Felix Bernstein: “Die mittleren Fehlerquadrate und Korrelationen der
Potenzmomente und ihre Anwendung auf Funktionen der Potenzmomente,” Metron,
Vol. X, N. 3 (Nov. 1932).

Felix Bernstein: “Über den mittleren Fehler der Potenzmomente,” Zeitschr. f. d.
ges. Vers.-Wissenschaft, Band 30, Heft 3, March 1930.
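The comparison stated in the theorem can be checked numerically. The following is only a sketch under stated assumptions: the criterion is taken in the form $(\beta_2 - 1)$ against $4(\beta_0 - 1)$, the function name `info_comparison` is illustrative, and the moment ratios are the standard values for the normal law ($\beta_2 = 3$, $\beta_0 = \pi/2$) and for the law $\frac{h}{2}e^{-h|x|}$ ($\beta_2 = 6$, $\beta_0 = 2$).

```python
import math


def info_comparison(beta2, beta0):
    """Return the two sides (beta2 - 1, 4*(beta0 - 1)) of the criterion.

    beta2 = mu4 / mu2**2 and beta0 = mu2 / theta**2, theta being the
    mean deviation. The mean deviation is the more informative statistic
    when the first value exceeds the second.
    """
    return beta2 - 1.0, 4.0 * (beta0 - 1.0)


# Normal law: mu4/mu2^2 = 3 and mu2/theta^2 = pi/2.
normal = info_comparison(3.0, math.pi / 2.0)

# Law (h/2) exp(-h|x|): mu2 = 2/h^2, mu4 = 24/h^4, theta = 1/h.
laplace = info_comparison(6.0, 2.0)

# Ratio of the two sides for the normal law: the fraction of the
# observations effectively used by the mean deviation.
efficiency_normal = normal[0] / normal[1]

print(normal, laplace, efficiency_normal)
```

For the normal law the ratio equals $1/(\pi - 2)$, about 0.88, which agrees with the figures of Gauss quoted above (one hundred observations against one hundred fourteen).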

In the present paper, we shall discuss the practical use of expressions for
correlation and regression in which the new type of statistics formed along
Laplacian lines will be used. These new expressions are of a linear form and
can therefore be computed more easily than those of Karl Pearson. The amount
of information given by these expressions is less than that given by the
expressions of Pearson if the normal law, in two variables, is fulfilled. For
other distributions, however, this is not generally true. The determination of
the standard deviations of these new expressions is given in Metron.³

The application of the new expressions of regression and correlation to grouped
data is set forth here for the first time. The method is strongly recommended
for all cases in which the data lose reliability with increasing deviations from
the mean. Deviations in the new method enter the expressions only in the
first degree and not in the second as in the case of Pearson's. It is obvious
that the influence of the doubtful extreme readings is, therefore, considerably
lessened. Since our expressions are linear, no adjustments for grouping
(Sheppard's corrections) are necessary.

It ought to be mentioned here that linear expressions for the measurement 
of correlation have been set up before. 

K. Pearson (Biometrika) and Egon Pearson (Biometrika) have derived an 
expression called “linear correlation ratio" which in case of linear regression is 
identical with the correlation coefficient. 

K. Pearson also discusses the linear correlation coefficient 


-K 


xsgx 


xsgy \ 

vmr 


suggested by Lena and various other linear expressions, all similar to our
expression (1). He finds that they are all equal to his quadratic correlation
coefficient in the case of a Gaussian distribution.

² To this second type of distribution curves also belongs $y = \psi(x)$, where
$\psi(x)$ is the mean of two Gaussian curves with the same origin, i.e.
$\psi(x) = \frac{1}{2}\left(\frac{h}{\sqrt{\pi}}\,e^{-h^2x^2} + \frac{kh}{\sqrt{\pi}}\,e^{-k^2h^2x^2}\right)$,
with $1.6 < k < 3.4$. I owe this remark and some other valuable suggestions
regarding the subject of this paper to Mr. Myron Fuchs.

³ Op. cit.

However, their expressions were not recommended by those authors for the 
determination of correlation between quantitative variables, because— 

1. No easy and practicable methods were given for their evaluation in the 
case of grouped data. 

2. Their standard deviations were not determined. 

We now proceed to define the new formulas and to describe the methods for 
their evaluation. The proofs are furnished in the Appendix to this paper. 


Let $r_1$ and $r_2$ denote the regression coefficients of x on y and y on x
respectively, $r$, as usual, the coefficient of correlation, and $\bar{x}$ and
$\bar{y}$ the arithmetic means of the x’s and y’s. Let us take $(\bar{x}, \bar{y})$
as the origin, so that x, y are the deviations from the mean. We have

$$r_1 = \frac{S_{+y}\,x}{S_{+y}\,y} \ \text{ or } \ r_1 = \frac{S_{-y}\,x}{S_{-y}\,y}, \qquad
r_2 = \frac{S_{+x}\,y}{S_{+x}\,x} \ \text{ or } \ r_2 = \frac{S_{-x}\,y}{S_{-x}\,x}, \qquad
r = \sqrt{r_1 r_2}. \tag{1}$$

$S_{+y}\,x$ denotes a partial sum of the x’s, this sum being extended over all
the x’s of the observations whose y is positive, and the other sums have a
corresponding meaning.

It should be noted, though, that if data occur whose y-deviation is 0
(practically never in a grouped table) one-half of the sum of these x’s should
be added to $S_{+y}\,x$. In $S_{+x}\,y$ a similar addition should be made in case
observations occur in which x is zero. (See Table IV.)

The formulas (1) and all following ones will be proved in the appendix to this
article.⁴

⁴ Using $r_1$ and $r_2$ of (1) the regression lines are $y = r_2 x$ and
$x = r_1 y$. They are those straight lines which fit the data best according to
the method of least squares, if the weight of the deviations is taken inversely
proportional to the absolute value of the variable. Taking x for instance as the
independent variable, $r_2$ is the value of $m$ which minimizes
$S\,\frac{1}{|x|}(y - mx)^2$ (the sum extended over all data x, y).
The standard deviations of $r_1$ and $r_2$ are

$$\sigma_{r_1}^2 = r_1^2\,\frac{\pi}{2N}\,\bigl(1 + m(m - 2r)\bigr) \quad\text{where } m = \frac{S\,x\,\mathrm{sg}\,x}{S\,x\,\mathrm{sg}\,y},$$

$$\sigma_{r_2}^2 = r_2^2\,\frac{\pi}{2N}\,\bigl(1 + n(n - 2r)\bigr) \quad\text{where } n = \frac{S\,y\,\mathrm{sg}\,y}{S\,y\,\mathrm{sg}\,x}. \tag{2}$$

We are now going to illustrate the computation of r and for this purpose
we shall use a table of Pearson's which gives the correlation between the heights
of fathers and daughters.

The totals at the right and lower end of the table are first computed, and
the bracketed numbers are the sums of the numbers that precede. The
means are

$$\bar{x} = \frac{1659.5 - 1179}{1376} = \frac{480.5}{1376}, \qquad
\bar{y} = \frac{1650.5 - 1390}{1376} = \frac{260.5}{1376},$$

whose signs determine on which side of the working mean to “quarter” the
table. This quartering is done in Table I by the lines vv and hh. Then the
totals above the heavy horizontal separating line hh and those to the left of
the vertical separating line vv are found, e.g. 2, 4.5, 7.25, ... and .5, .5, 0, ... .
Multiplying these totals by the respective class marks, we find the outside lines:
18, 36, 50.75, ... and 5.5, 5, 0, ... .

$S_{-y}\,x$ is now $= 1107.5 - 420.5 = 687$, and an adjustment for the fact that a
working mean has been used has yet to be made. This adjustment is $\bar{x}\,N_{-y}$,
where $N_{-y}$ is the number of negative y’s ($N_{-y} = 728$).

We have therefore for the adjusted values 


« 1107.5 - 420.5 + 


•728 = 825.07 


= 1179 + 11^-728 = 1433.21 


$r_1 = .5757$, $r_2 = .5170$, $r = .546$.

The standard deviations, according to the formulas (2), are
$\sigma_{r_1} = .031$, $\sigma_{r_2} = .027$.



Correlation between Heights of Fathers and Daughters
x = Height of Fathers, y = Height of Daughters, in inches

Working mean = 67.5
Class width 1 inch
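The standard deviation of $r^2 = r_1 \times r_2$ has to be estimated by using the
general formula for the standard deviation of the product c of two variables
a and b:

$$\frac{\sigma_c^2}{c^2} = \frac{\sigma_a^2}{a^2} + \frac{\sigma_b^2}{b^2} + \frac{2R\,\sigma_a\sigma_b}{ab},$$

R being the correlation coefficient between a and b. Since $-1 < R < +1$,
substitution of these limits for R leads to the inequalities

$$\left|\frac{\sigma_a}{a} - \frac{\sigma_b}{b}\right| < \frac{\sigma_c}{c} < \frac{\sigma_a}{a} + \frac{\sigma_b}{b}.$$

Putting $a = r_1$, $b = r_2$, $c = r^2$ we have

$$\left|\frac{\sigma_{r_1}}{r_1} - \frac{\sigma_{r_2}}{r_2}\right| < \frac{\sigma_{r^2}}{r^2} < \frac{\sigma_{r_1}}{r_1} + \frac{\sigma_{r_2}}{r_2}.$$

Considering the relation $\sigma_{r^2} = 2r\,\sigma_r$ we have

$$\frac{1}{2r}\,\bigl|\sigma_{r_1} r_2 - \sigma_{r_2} r_1\bigr| < \sigma_r < \frac{1}{2r}\,\bigl(\sigma_{r_1} r_2 + \sigma_{r_2} r_1\bigr),$$

from which we derive with sufficient approximation

$$\sigma_r < .030$$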
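The bound on $\sigma_r$ can be retraced with the figures quoted for the table of fathers and daughters. This is a minimal sketch, assuming the limits are $(\sigma_{r_1} r_2 \mp \sigma_{r_2} r_1)/(2r)$ with $r = \sqrt{r_1 r_2}$; the function name `sigma_r_bounds` is illustrative only.

```python
import math


def sigma_r_bounds(r1, r2, sigma_r1, sigma_r2):
    """Return (r, lower, upper), the limits for the standard deviation of r.

    The limits correspond to a correlation of -1 and +1 between r1 and r2
    in the formula for the standard deviation of the product r1 * r2.
    """
    r = math.sqrt(r1 * r2)
    lower = abs(sigma_r1 * r2 - sigma_r2 * r1) / (2.0 * r)
    upper = (sigma_r1 * r2 + sigma_r2 * r1) / (2.0 * r)
    return r, lower, upper


# Values quoted in the text for the fathers and daughters example.
r, lower, upper = sigma_r_bounds(0.5757, 0.5170, 0.031, 0.027)
print(round(r, 3), round(lower, 4), round(upper, 4))
```

With these inputs the upper limit falls just below .030, in agreement with the approximation stated in the text.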


A slightly different arrangement for computing r has been made in the 
following table. 


TABLE II

Correlation between diameter of the stem and length of the longest flower petal of
Trientalis europaea*

        P.S.    3   15   34   45   30    6    2    0    0    0    0
P.S.   y \ x   -4   -3   -2   -1    0    1    2    3    4    5    6   Total
   1     -4     1                                                        1
   7     -3     1    4    1    1                                         7
  29     -2     1    9   16    3    1                                   30
  33     -1          2    9   22    9    2    1                         45
  27      0               8   19   20    4    1                         52
   8      1     1              7   18   12    6    4                    48
   1      2                    1    8    9    3    2    1               24
          3                              3    6    4    1               14
          4                                   2    2    1    2           7
          5                                             1    3           4
          6                                             1         1      2
       Total    4   15   34   53   56   30   19   12    5    5    1    234

* E. Czuber: Die statistischen Forschungsmethoden, Wien, 1921.
TABLE III

x = Diameter of the stem.
y = Length of the longest flower petal in millimeters.
Working mean, x_m = .826, y_m = 34.6.
Class width of x = .4 mm., of y = 6 mm.

   x     Total     P.S.        y     Total     P.S.
        times x   times x           times y   times y
  -4       16        12       -4        4         4
  -3       45        45       -3       21        21
  -2       68        68       -2       60        58
  -1       53        45       -1       45        33
   0     (182)     (170)       0     (130)     (116)
   1       30         6        1       48         8
   2       38         4        2       48         2
   3       36         0        3       42
   4       20         0        4       28
   5       25         0        5       20
   6        6         0        6       12
 Mean    (155)      (10)             (198)      (10)
          -27                         +68


The P.S. columns are the partial sums as explained in the previous table. 
The work of multiplying the totals by the class marks and of adding them has 
been separated here from the table. 

We obtain $N = 234$, $N_{-x} = 106$, $N_{-y} = 135$


n 


rt 


97 

170 -iO-^X 135 

130 + Si X 135 


116 - 10 + H X 106 

182 -g^X 106 


.805 


.834 


r = .82 


Pearson's coefficient for this table is r = .83. 

Finally we illustrate by a small non-grouped table where the partial sums 
can be written down immediately. 
TABLE IV

Correlation between Ages of Husband and Wife

 Age of     Age of    Deviation   Deviation
 Husband    Wife      Husband     Wife
   22         18         -8          -8
   24         20         -6          -6
   26         20         -4          -6
   26         24         -4          -2
   27         22         -3          -4
   27         24         -3          -2
   28         27         -2          +1
   28         24         -2          -2
   29         21         -1          -5
   30         25          0          -1
   30         29          0          +3
   30         32          0          +6
   31         27         +1          +1
   32         27         +2          +1
   33         30         +3          +4
   34         27         +4          +1
   35         30         +5          +4
   35         31         +5          +5
   36         30         +6          +4
   37         32         +7          +6
 Ave 30       26

Here 0-deviations occur in the third column. Hence⁵

$S_{+x}\,y = 26 + \frac{1}{2} \times 8 = 30$, $S_{+x}\,x = 33$, $S_{+y}\,x = 31$, $S_{+y}\,y = 36$,

$r_1 = .86$, $r_2 = .91$, $r = .88$ (Pearson’s r = .86)
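The partial sums of Table IV are small enough to be retraced mechanically. The following sketch assumes the definitions of formula (1), with one-half of the zero-deviation observations added to the positive partial sum; the helper names `deviations` and `partial_sum` are illustrative and not part of the original method.

```python
import math

# Ages from Table IV.
husband = [22, 24, 26, 26, 27, 27, 28, 28, 29, 30,
           30, 30, 31, 32, 33, 34, 35, 35, 36, 37]
wife = [18, 20, 20, 24, 22, 24, 27, 24, 21, 25,
        29, 32, 27, 27, 30, 27, 30, 31, 30, 32]


def deviations(values):
    """Deviations of the values from their arithmetic mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def partial_sum(values, selector):
    """Sum of `values` over observations whose `selector` is positive,
    plus one-half of the values whose `selector` is exactly zero."""
    total = 0.0
    for v, s in zip(values, selector):
        if s > 0:
            total += v
        elif s == 0:
            total += 0.5 * v
    return total


x = deviations(husband)
y = deviations(wife)

r1 = partial_sum(x, y) / partial_sum(y, y)  # regression of x on y
r2 = partial_sum(y, x) / partial_sum(x, x)  # regression of y on x
r = math.sqrt(r1 * r2)

print(round(r1, 2), round(r2, 2), round(r, 2))
```

The four partial sums come out as 31, 36, 30 and 33, so that the three coefficients agree with the values given above for Table IV.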

Appendix 

Proof of formula (1), page 1. The following notations will be used:

$(f(x))^\circ$ = probable value of $f(x)$.

$(f(y))^\circ_x$ = probable value of $f(y)$ for a fixed x.

$\mathrm{sg}\,x$ = sign of x $= \dfrac{x}{|x|} = \pm 1$ for $x \neq 0$; $\mathrm{sg}\,x = 0$ if $x = 0$.

⁵ See page 7.

The assumption of linear regression means that

$$y^\circ_x - y^\circ = r_{y:x}\,(x - x^\circ). \tag{4}$$

We multiply both sides of (4) by some arbitrary function $\phi(x)$ of x and get

$$(y^\circ_x - y^\circ)\,\phi(x) = r_{y:x}\,(x - x^\circ)\,\phi(x).$$

Both sides are functions of x. We shall take their probable values for all x’s.

Now, for a fixed x, $y^\circ_x\,\phi(x) = (y\,\phi(x))^\circ_x$, and the probable value
of $(y\,\phi(x))^\circ_x$ for all x’s is equal to the total probable value
$(y\,\phi(x))^\circ$. So we have

$$(y\,\phi(x))^\circ - y^\circ\,(\phi(x))^\circ = r_{y:x}\,\bigl((x - x^\circ)\,\phi(x)\bigr)^\circ,$$

$$r_{y:x} = \frac{\bigl((y - y^\circ)\,\phi(x)\bigr)^\circ}{\bigl((x - x^\circ)\,\phi(x)\bigr)^\circ}. \tag{6}$$

If now we take $(x^\circ, y^\circ)$ as the origin, we get

$$r_{y:x} = \frac{(y\,\phi(x))^\circ}{(x\,\phi(x))^\circ}$$

and similarly

$$r_{x:y} = \frac{(x\,\phi_1(y))^\circ}{(y\,\phi_1(y))^\circ},$$

where $\phi_1$ is another arbitrary function.

Replacing the probable values by the respective arithmetic means we get

$$r_{y:x} = \frac{S\,y\,\phi(x)}{S\,x\,\phi(x)}, \qquad r_{x:y} = \frac{S\,x\,\phi_1(y)}{S\,y\,\phi_1(y)},$$

with $(\bar{x}, \bar{y})$ as the origin.

By a suitable choice of the still arbitrary functions $\phi$ and $\phi_1$, we may derive
all the various expressions for regression coefficients. Taking, for instance,
$\phi(x) = x$, $\phi_1(y) = y$, we get Pearson's expressions. Taking
$\phi(x) = \mathrm{sg}(x - \alpha_1)$, $\phi_1(y) = \mathrm{sg}(y - \alpha_2)$,
$\alpha_1$ and $\alpha_2$ being constants, we have

$$r_{y:x} = \frac{S\,y\,\mathrm{sg}(x - \alpha_1)}{S\,x\,\mathrm{sg}(x - \alpha_1)}, \qquad
r_{x:y} = \frac{S\,x\,\mathrm{sg}(y - \alpha_2)}{S\,y\,\mathrm{sg}(y - \alpha_2)},$$

and if we make $\alpha_1 = \alpha_2 = 0$

$$r_{y:x} = \frac{S\,y\,\mathrm{sg}\,x}{S\,x\,\mathrm{sg}\,x}, \qquad
r_{x:y} = \frac{S\,x\,\mathrm{sg}\,y}{S\,y\,\mathrm{sg}\,y}. \tag{8}$$

Since $Sx = Sy = 0$, we can add Sy or Sx to the numerators and denominators.
Adding Sy to the numerator, Sx to the denominator and multiplying both
terms of the fraction by $\frac{1}{2}$ we get

$$r_{y:x} = \frac{\frac{1}{2}\,S\,y\,\bigl(\mathrm{sg}(x - \alpha_1) + 1\bigr)}{\frac{1}{2}\,S\,x\,\bigl(\mathrm{sg}(x - \alpha_1) + 1\bigr)}. \tag{9}$$
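The role of the arbitrary function $\phi$ can be illustrated on the deviations of Table IV. This is a sketch only: the function `regression_coefficient` and its argument names are assumptions made for illustration, and $\alpha_1 = 0$ is taken throughout.

```python
def sign(v):
    """sg v: +1, 0 or -1 according to the sign of v."""
    return (v > 0) - (v < 0)


def regression_coefficient(x, y, phi):
    """r_{y:x} = S y*phi(x) / S x*phi(x), for deviations from the means."""
    numerator = sum(yi * phi(xi) for xi, yi in zip(x, y))
    denominator = sum(xi * phi(xi) for xi in x)
    return numerator / denominator


# Deviations of husband (x) and wife (y) from Table IV.
x = [-8, -6, -4, -4, -3, -3, -2, -2, -1, 0, 0, 0, 1, 2, 3, 4, 5, 5, 6, 7]
y = [-8, -6, -6, -2, -4, -2, 1, -2, -5, -1, 3, 6, 1, 1, 4, 1, 4, 5, 4, 6]

pearson_slope = regression_coefficient(x, y, lambda v: v)      # phi(x) = x
sign_slope = regression_coefficient(x, y, sign)                # form (8)
half_form = regression_coefficient(
    x, y, lambda v: 0.5 * (sign(v) + 1))                       # form (9)

print(pearson_slope, sign_slope, half_form)
```

Forms (8) and (9) give the same number, 30/33, because the deviations sum to zero; the choice $\phi(x) = x$ gives the ordinary least-squares slope 287/324 for the same data.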
Instead of (9) we can write

$$r_{y:x} = \frac{S_{x > \alpha_1}\,y + \frac{1}{2}\,S_{x = \alpha_1}\,y}{S_{x > \alpha_1}\,x + \frac{1}{2}\,S_{x = \alpha_1}\,x} \tag{10}$$

since the operations of (9) multiply the y ordinates by 0, $\frac{1}{2}$, 1 according
as the x’s are less than, equal to, or greater than $\alpha_1$.

The expression (10), with a suitable choice of $\alpha_1$, should be used for the
purpose of numerical calculation of r. For instance, when calculating r from the
data of Table IV, we took $\alpha_1 = \alpha_2 = 0$ and had

$$r_{y:x} = \frac{S_{+x}\,y + \frac{1}{2}\,S_{x = 0}\,y}{S_{+x}\,x}.$$

When dealing with data which are arranged in a grouped table (Tables I
and II) we take $\alpha_1$ equal to the x-ordinate of that classline which is
nearest to the mean. (In Table I, $\alpha_1 = .5 - \ldots$)⁶ With that choice of
$\alpha_1$ the sums $S_{x = \alpha_1}$ disappear and the sums $S_{x > \alpha_1}$ are
equivalent to the corresponding sums $S_{+x}$. Hence we have

$$r_{y:x} = \frac{S_{+x}\,y}{S_{+x}\,x} \tag{11}$$

and similarly

$$r_{x:y} = \frac{S_{+y}\,x}{S_{+y}\,y}.$$

Instead of (9) we can also write

$$r_{y:x} = \frac{\frac{1}{2}\,S\,y\,\bigl(\mathrm{sg}(x - \alpha_1) - 1\bigr)}{\frac{1}{2}\,S\,x\,\bigl(\mathrm{sg}(x - \alpha_1) - 1\bigr)}. \tag{9a}$$

This leads to

$$r_{y:x} = \frac{S_{-x}\,y}{S_{-x}\,x} \quad\text{and}\quad r_{x:y} = \frac{S_{-y}\,x}{S_{-y}\,y}. \tag{11a}$$

⁶ It is desirable to choose the absolute values of the $\alpha$’s small so that the
maximum number of data enter into the calculation of r. However, to take
$\alpha_1 = \alpha_2 = 0$ would necessitate a division of the middle arrays of a
grouped table, a laborious process. Hence the choice of the $\alpha$’s as described
above.
Proof of the standard deviations of Formula (2).

In my article on standard deviations and correlations of moments⁷ the
standard deviations of the expressions used in this article have been derived.

In the following, the notation of the Metron article just referred to will be
used. We use the symbols:

$$P_{m,n} = \frac{1}{N}\sum x^m y^n, \qquad P_{/m,n} = \frac{1}{N}\sum x^m\,\mathrm{sg}\,x\;y^n,$$

$$P_{m,/n} = \frac{1}{N}\sum x^m y^n\,\mathrm{sg}\,y, \qquad P_{/m,/n} = \frac{1}{N}\sum x^m\,\mathrm{sg}\,x\;y^n\,\mathrm{sg}\,y.$$

The summations indicated extend over all observations. The true or probable
values of the same expressions are indicated by using p instead of P.

$$r_{x:y} = r_1 = \frac{P_{1,/0}}{P_{0,/1}}$$

We derive the standard deviations by defining the deviations as first variations.

$$\log r_1 = \log P_{1,/0} - \log P_{0,/1}$$

$$\bigl[(\delta r_1)^2\bigr]^\circ = (r_1)^2 \left[\left(\frac{\delta P_{1,/0}}{p_{1,/0}} - \frac{\delta P_{0,/1}}{p_{0,/1}}\right)^2\right]^\circ$$

The probable values of the terms on the right hand side of the last equation are
derived on pages 17-19 and listed on pages 32-33 of the Metron article referred
to. The proofs, which imply essentially a process of variation of Stieltjes
integrals, will not be given here. From pages 32-33 we take

$$\bigl[(\delta P_{1,/0})^2\bigr]^\circ = \frac{p_{2,0} - p_{1,/0}^2}{N}, \qquad
\bigl[(\delta P_{0,/1})^2\bigr]^\circ = \frac{p_{0,2} - p_{0,/1}^2}{N}, \qquad
\bigl[\delta P_{1,/0}\,\delta P_{0,/1}\bigr]^\circ = \frac{p_{1,1} - p_{1,/0}\,p_{0,/1}}{N},$$

so that

$$\sigma_{r_1}^2 = \frac{1}{N}\,(r_1)^2 \left[\frac{p_{2,0}}{p_{1,/0}^2} + \frac{p_{0,2}}{p_{0,/1}^2} - \frac{2\,p_{1,1}}{p_{1,/0}\,p_{0,/1}}\right].$$

Assuming Gaussian distribution, we can put

$$p_{2,0} = \frac{\pi}{2}\,p_{/1,0}^2, \qquad p_{0,2} = \frac{\pi}{2}\,p_{0,/1}^2, \qquad
p_{1,1} = r\,\sqrt{p_{2,0}\,p_{0,2}} = r\,\frac{\pi}{2}\,p_{/1,0}\,p_{0,/1}.$$

Hence

$$\sigma_{r_1}^2 = r_1^2\,\frac{\pi}{2N}\left(1 + \frac{p_{/1,0}^2}{p_{1,/0}^2} - 2r\,\frac{p_{/1,0}}{p_{1,/0}}\right). \tag{15}$$

Replacing the theoretical values by their corresponding empirical values,
we have

$$\sigma_{r_1}^2 = r_1^2\,\frac{\pi}{2N}\,(1 + m^2 - 2rm) \quad\text{where } m = \frac{S\,x\,\mathrm{sg}\,x}{S\,x\,\mathrm{sg}\,y}. \tag{16}$$

The formula for $\sigma_{r_1}$ has been derived here for the value of $r_1$ as given
by (8), i.e. $r_1 = \dfrac{S\,x\,\mathrm{sg}\,y}{S\,y\,\mathrm{sg}\,y}$. In fact, we used
$r_1 = \dfrac{S\,x\,\mathrm{sg}(y - \alpha)}{S\,y\,\mathrm{sg}(y - \alpha)}$ in the examples
in the article, and $\alpha$ had some value absolutely smaller than .5. To use
equation (16) for the standard deviation of $r_1$ is within the limits of the
required degree of accuracy; hence we shall disregard the difference. In a later
paper the standard deviation of $r_1$ for any $\alpha$ will be derived by using the
method described in the Metron article, for a different purpose.

⁷ Felix Bernstein: “Die mittleren Fehlerquadrate und Korrelationen der
Potenzmomente und ihre Anwendung auf Funktionen der Potenzmomente,” Metron,
Vol. X, N. 3 (Nov. 1932).
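A formula of the type (16) is easily evaluated. The sketch below rests on two assumptions that should be kept in view: the formula is read as $\sigma^2 = r_i^2\,\frac{\pi}{2N}(1 + m^2 - 2rm)$, and, since the empirical ratio m for the table of fathers and daughters is not reproduced here, its Gaussian value $m = 1/r$ is substituted. The function name `sigma_regression` is illustrative.

```python
import math


def sigma_regression(r_i, r, m, n_obs):
    """Standard deviation of a partial-sum regression coefficient r_i,
    read as sqrt(r_i^2 * pi/(2N) * (1 + m^2 - 2*r*m))."""
    return abs(r_i) * math.sqrt(math.pi / (2.0 * n_obs) * (1.0 + m * m - 2.0 * r * m))


# Coefficients quoted in the text for the fathers and daughters example.
r1, r2, r, n_obs = 0.5757, 0.5170, 0.546, 1376

# Assumption: under the normal law E|x| / E(x sg y) = 1/r, used here in
# place of the empirical ratio of sums.
m_gauss = 1.0 / r

sigma_r1 = sigma_regression(r1, r, m_gauss, n_obs)
sigma_r2 = sigma_regression(r2, r, m_gauss, n_obs)
print(round(sigma_r1, 4), round(sigma_r2, 4))
```

Under this substitution the two values come out close to .030 and .027, of the same size as the standard deviations .031 and .027 quoted in the text, where the empirical ratios were used.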

To prove the statement in the footnote to page 7:

To find the value of $r_2$ that makes

$$S\,f(x)\,(y - r_2 x)^2$$

a minimum.

By differentiating we get

$$S\,f(x)\,(y - r_2 x)\,x = 0, \qquad r_2 = \frac{S\,x\,f(x)\,y}{S\,x\,f(x)\,x}.$$

If $f(x) = 1$ we get Pearson’s coefficient.

If $f(x) = \dfrac{1}{|x|}$ $(x \neq 0)$ we get

$$r_2 = \frac{S\,\frac{x}{|x|}\,y}{S\,\frac{x}{|x|}\,x} = \frac{S\,y\,\mathrm{sg}\,x}{S\,x\,\mathrm{sg}\,x}.$$
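The minimum property can be verified directly on the deviations of Table IV. The following is a minimal sketch; the names `weighted_slope` and `weighted_objective` are illustrative, and observations with $x = 0$ are left out because the weight $1/|x|$ is not defined for them.

```python
def weighted_slope(x, y, weight):
    """Slope minimizing S weight(x) * (y - slope*x)^2 over observations with x != 0."""
    numerator = sum(weight(xi) * xi * yi for xi, yi in zip(x, y) if xi != 0)
    denominator = sum(weight(xi) * xi * xi for xi in x if xi != 0)
    return numerator / denominator


def weighted_objective(slope, x, y, weight):
    """Weighted sum of squared residuals for a given slope."""
    return sum(weight(xi) * (yi - slope * xi) ** 2
               for xi, yi in zip(x, y) if xi != 0)


# Deviations of husband (x) and wife (y) from Table IV.
x = [-8, -6, -4, -4, -3, -3, -2, -2, -1, 0, 0, 0, 1, 2, 3, 4, 5, 5, 6, 7]
y = [-8, -6, -6, -2, -4, -2, 1, -2, -5, -1, 3, 6, 1, 1, 4, 1, 4, 5, 4, 6]


def inverse_abs(v):
    """Weight inversely proportional to the absolute value of the variable."""
    return 1.0 / abs(v)


slope_abs = weighted_slope(x, y, inverse_abs)        # f(x) = 1/|x|
slope_unit = weighted_slope(x, y, lambda v: 1.0)     # f(x) = 1
print(slope_abs, slope_unit)
```

With the weight $1/|x|$ the slope equals $S\,y\,\mathrm{sg}\,x / S\,x\,\mathrm{sg}\,x = 60/66$, and with unit weight it is the ordinary least-squares slope.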


New York University, 

Departments of Anatomy of the Graduate School and the College of Dentistry. 



METHODS OF OBTAINING PROBABILITY DISTRIBUTIONS 1 

By Burton H. Camp 


The emphasis of this paper will be on method. Special results will be cited
in order to illustrate the methods rather than to summarize achievement in the
field; for that has been done already by Rider (1930, 1935), Irwin (1935) and
Shewhart (1933) in recent surveys. The purpose is to describe and to illustrate
most of the methods that have been used to determine exact probability
distributions, and to show that they are all derivable from one fundamental
theorem. In order to prove this unity in a simple manner, it will be desirable to
omit from consideration methods which are essentially ingenious forms of
counting, such as are used in sampling without replacements from finite
universes, and in finding the sampling distribution of a percentile.

The general problem to be discussed may be stated as follows: N individuals
$(t_1, \ldots, t_N)$ are drawn, one at a time with replacements, from a universe whose
probability distribution is $\phi(t)$. A certain single valued function of the t’s is
formed. This is called a parameter of the sample, and is frequently also,
but not necessarily, a useful estimate of the corresponding parameter of the
universe. The problem is to find its probability distribution, $f(x)$. As usual,
a probability distribution is a function which is required to be defined, except
perhaps at a set of measure zero, throughout the infinite domain of its variables;
it is nowhere negative, and its integral over its domain is unity.

Most of the more recent developments of the theory relate to a more general
form of this problem. Instead of N individuals, there are N sets of n individuals
in each set, and these sets are drawn respectively from M $(M \le N)$ universes,
each of which is described by a function of n independent variables, thus:

$$\phi^{(i)}(t_1, \ldots, t_n); \quad (i = 1, \ldots, M). \tag{1}$$

Instead of a single parameter there are P parameters, and each is a single valued
function of the observed values of the nN individuals in the sample, thus:

$$x_i = g_i\bigl(t_1^{(1)}, \ldots, t_n^{(1)}; \; \ldots; \; t_1^{(N)}, \ldots, t_n^{(N)}\bigr); \quad (i = 1, \ldots, P). \tag{2}$$

The first method to be described is fundamental and will be designated as
Theorem I. Let it be required that each g as described in (2) be not only
single valued but also constant at most in a set of measure zero in the nN-way
space of the t’s. Then

$$\int_p f(x_1, \ldots, x_P)\,dX = \int_q \Phi\bigl(t_1^{(1)}, \ldots, t_n^{(N)}\bigr)\,dT \tag{I}$$

1 Presented to the American Mathematical Society at a meeting devoted to expository 
papers on the theory of statistics, April 11, 1936. 
where X is the space of x’s and T the space of the t’s, p is any measurable set
of points in X, and q is the set in T for which g is in p. Often p is the
P-dimensional cube $(x_i, x_i + \Delta x;\ i = 1, \ldots, P)$ at the point
$(x_1, \ldots, x_P)$, and then q is the set where

$$x_i \le g_i \le x_i + \Delta x; \quad (i = 1, \ldots, P) \tag{3}$$

and $\Phi$ is the simultaneous distribution of the sets of t’s,

$$\Phi = \phi^{(1)}\bigl(t_1^{(1)}, \ldots, t_n^{(1)}\bigr) \cdots \phi^{(N)}\bigl(t_1^{(N)}, \ldots, t_n^{(N)}\bigr). \tag{4}$$

In this $\phi^{(j)}$ is the universe from which the j-th set of t’s is drawn.
Obviously, if $N > M$, some of the $\phi^{(j)}$’s are identical, and then it is assumed
that the several sets are drawn independently. Often, all of the N sets of t’s are
drawn from the same universe. Then $M = 1$ and all these $\phi$’s are identical, and
(4) becomes

$$\Phi = \prod_{j=1}^{N} \phi\bigl(t_1^{(j)}, \ldots, t_n^{(j)}\bigr).$$

In the special case where there is but one parameter $(P = 1)$ and but one
individual in the sample $(n = N = 1)$, and p is an interval, formula (I) becomes

$$\int_x^{x + \Delta x} f(x)\,dx = \int_q \phi\,dt; \tag{Ia}$$

and in the very special case where it is also true that q is an interval it becomes

$$f(x) = \phi(t)\,\left|\frac{dt}{dx}\right| \tag{Ib}$$

provided also that certain derivatives (to be specified later in the proof) exist,
where t is now the inverse solution of the equation,

$$x = g(t). \tag{5}$$

The proof of formula (I) is immediate, if one is willing to assume the existence
of the probability distribution f; for then the left side is by definition the
probability that the x’s lie in p, and this is also the meaning of the right side of
(I). (Ia) can be proved without assuming initially the existence of $f(x)$, for then
the existence of $f(x)$ can be inferred from the existence of the right side of (Ia),
because $f(x)$ may be set equal (except perhaps at a set of measure zero) to the
upper right hand derivative, with respect to $\Delta x$ ($\Delta x$ is a variable, and x
is fixed), of $\int_q \phi\,dt$, provided that one adds the condition that this
derivative is nowhere infinite. The point at issue here is merely the existence of a
primitive for a monotone increasing function of $\Delta x$. (Ib) may be derived from
(Ia) by taking the derivative of both sides with respect to $\Delta x$, if the
derivatives are continuous.
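Form (Ib) can be checked numerically on a simple case. The sketch below is an illustration under assumptions not made in the text: the universe is taken to be the unit normal, the parameter is $x = g(t) = e^t$, so that $t = \log x$ and $dt/dx = 1/x$, and the names `phi`, `f`, `normal_cdf` and `integrate` are hypothetical helpers.

```python
import math


def phi(t):
    """Unit normal universe (an assumption made for this illustration)."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def f(x):
    """Distribution of x = exp(t) by (Ib): phi(t) * |dt/dx| with t = log x."""
    return phi(math.log(x)) / x


def normal_cdf(t):
    """Probability that a unit normal variable is below t."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def integrate(func, a, b, steps=20000):
    """Midpoint rule, sufficient for a smooth integrand on [a, b]."""
    width = (b - a) / steps
    return width * sum(func(a + (k + 0.5) * width) for k in range(steps))


# Probability that x lies in p = (1, 3), computed in X and in T.
left_side = integrate(f, 1.0, 3.0)                         # integral of f over p
right_side = normal_cdf(math.log(3.0)) - normal_cdf(0.0)   # integral of phi over q
print(left_side, right_side)
```

The two sides agree to within the accuracy of the quadrature, which is the statement of (Ia) for the interval p = (1, 3) and the corresponding interval q = (0, log 3).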

Theorem I, in these various forms, is used a great deal, especially in the last
form (Ib). This affords one freedom to choose the most desirable function
for purposes of tabulation. R. A. Fisher's z distribution, a logarithm, is an
important illustration. Many authors have been interested in so choosing the
function that its distribution shall be normal. They include several of the
older writers, and more recently H. L. Rietz (1921, 1927), and G. A. Baker
(1932, 1934). However, the theorem is of special importance in the theory,
for all the other principal methods of obtaining probability distributions are
essentially corollaries of it. These corollaries will be called Theorems II, III,
and IV.

Theorem II. Let $\bar{p}$ (the measure of p) and $\bar{q}$ (the measure of q) be
infinitesimals of the same order and let both the oscillation of f (i.e. maximum f
minus minimum f) in p and the oscillation of $\Phi$ in q be infinitesimals; then (I)
may be written,

$$f\,\bar{p} = \Phi\,\bar{q}, \tag{II}$$

where f applies to any point of p and $\Phi$ to the corresponding point of q. This
equation (II) is an approximate equation in the sense that differences of higher
order than those retained are neglected. In particular, with the conditions
used in formula (Ia), equation (II) becomes

$$f\,\Delta x = \phi\,\bar{q}.$$

The left side of (II) is an approximation to the probability sought. The right
side shows that, in order to evaluate it, one need only find the volume in T space
of the differential element q and multiply it by the value of $\Phi$ in q. Formula (II)
expresses the so-called geometrical method used by many authors, e.g., by
R. A. Fisher (1915, 1925), by Wishart (1928), and by Hotelling (1925, 1927).
The chief difficulty in connection with it is in finding the volume of the
nN-dimensional q. In order to display the advantages and disadvantages of this
method we shall pause at this point and look at a concrete example.²

Let two individuals $(t_1, t_2)$ be drawn independently from a normal universe
and consider the simultaneous distribution $f(x, y)$ of the sum, $x = t_1 + t_2$,
and product, $y = t_1 t_2$, the mean of the universe being chosen as the origin.
Here $N = 2$, $n = 1$, $M = 1$, and so,

$$\Phi = \frac{1}{2\pi\sigma^2}\,e^{-\frac{t_1^2 + t_2^2}{2\sigma^2}} = \frac{1}{2\pi\sigma^2}\,e^{-\frac{x^2 - 2y}{2\sigma^2}}. \tag{6}$$

The point set q is the area lying between the two adjacent hyperbolae,

$$t_1 t_2 = y, \qquad t_1 t_2 = y + \Delta y,$$

and also between the two adjacent lines,

$$t_1 + t_2 = x, \qquad t_1 + t_2 = x + \Delta x,$$

where $\Delta x$ and $\Delta y$ are infinitesimals and are equal. This area may be
computed by simple integration and is:

$$\bar{q} = \frac{2\,\Delta x\,\Delta y}{\sqrt{x^2 - 4y}} \ \text{ if } x^2 > 4y, \qquad \bar{q} = 0 \ \text{ if } x^2 < 4y.$$

Hence (II) gives us immediately the desired result:

$$f(x, y)\,\Delta x\,\Delta y = \frac{1}{\pi\sigma^2}\,e^{-\frac{x^2 - 2y}{2\sigma^2}}\,\frac{1}{\sqrt{x^2 - 4y}}\,\Delta x\,\Delta y \ \text{ if } x^2 > 4y, \qquad = 0 \ \text{ if } x^2 < 4y.$$

² See also C. C. Craig (1936). Craig uses another method to be explained later
(formula IIIa).
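The result for the sum and product can be compared with the universe itself at any chosen pair $(t_1, t_2)$. This sketch assumes the density has the form $\frac{1}{\pi\sigma^2}\,e^{-(x^2 - 2y)/2\sigma^2}/\sqrt{x^2 - 4y}$ for $x^2 > 4y$; the function names and the sample point are illustrative.

```python
import math


def normal_density(t, sigma=1.0):
    """Normal universe with mean 0 and standard deviation sigma."""
    return math.exp(-t * t / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))


def density_sum_product(x, y, sigma=1.0):
    """Simultaneous distribution of x = t1 + t2 and y = t1 * t2."""
    discriminant = x * x - 4.0 * y
    if discriminant < 0.0:
        return 0.0          # no real pair (t1, t2) has this sum and product
    if discriminant == 0.0:
        return math.inf     # ordinates rise to infinity on the parabola
    exponent = -(x * x - 2.0 * y) / (2.0 * sigma ** 2)
    return math.exp(exponent) / (math.pi * sigma ** 2 * math.sqrt(discriminant))


# An arbitrary pair drawn from T space, used only as a check point.
t1, t2 = 1.3, -0.4
x, y = t1 + t2, t1 * t2

# Two points of T, (t1, t2) and (t2, t1), map to the same (x, y); each
# carries the element dx dy / |t1 - t2|, whence the factor 2.
direct = 2.0 * normal_density(t1) * normal_density(t2) / abs(t1 - t2)
print(density_sum_product(x, y), direct)
```

The agreement rests on the identities $x^2 - 4y = (t_1 - t_2)^2$ and $x^2 - 2y = t_1^2 + t_2^2$, which are what allow $\Phi$ and $\bar{q}$ to be written in terms of x and y alone.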


If $x^2 = 4y$, $\bar{q}$ is an infinitesimal of lower order than $\bar{p} = (\Delta x)^2$,
and so Theorem II does not apply. In this case we must go back to Theorem I, and
from that we can learn that the probability,

$$\int_p f\,dx\,dy,$$

is an infinitesimal of the first order if $\bar{p} = \Delta x\,\Delta y = (\Delta x)^2$ is
of the second order. Hence it cannot be approximately represented by a finite
number times $\bar{p}$. The oscillation of f in p is infinite. The form of the surface
$f(x, y)$ is interesting. The ordinates rise to infinity on the contour of the
parabola $x^2 = 4y$, and vanish within it. The surface is symmetrical with respect
to the plane $x = 0$, but not with respect to the plane $y = 0$. However, it is clear
that the total probability of any given product, y (i.e. the probability of this y
for all possible values of x), is the same as the total probability of $-y$; hence


/ /(*, ~ 2 /)dx, 
J— » 


and the corresponding formulae, 


2 

- e 


V 

1 



2 ** 


Vs 2 — 4 y 


- dx 


and 



2»* 


\/x i — 4y 


dx 


(y > o), 


( y < o), 


must be equal; both may be reduced to the single form 




if y j* 0. 


This is the probability distribution of y. 

With this example before us, let us now reconsider the theory: 

(i) The requirement (in II) that the oscillation of $\Phi$ be infinitesimal in q
will be satisfied if one can show that $\Phi$ may be expressed as a continuous function
of the parameters $(x_1, x_2, \ldots, x_P)$. In our example these parameters were
x and y and $\Phi$ was so expressible (6). But if we had tried initially to find by
means of (II) the distribution of the product y, independently of what values
x might have, we should have been stopped at this point, because $\Phi$ is not
expressible in terms of y alone. We should also have been stopped by the
requirement that $\bar{q}$ be infinitesimal of order $\Delta y$, for q would have been the
space between two hyperbolas and its area for any fixed $\Delta y > 0$ would have
been infinite. But, when thus stopped at that first point, it would have been
clearly indicated to us that the distribution of y might have been found via
the detour of finding the simultaneous distribution of both x and y, because
an attempt to express $\Phi$ in terms of y would have led to the given expression in
terms of both x and y. For a similar reason R. A. Fisher (1925) was able to
find the distribution of the variance by finding first the simultaneous distribution
of the variance and the mean. Also, he was thus able to find the distribution
of the coefficient of correlation by finding first the simultaneous distribution of
all the first and second order moments.

(ii) A distinct advantage of this method is that $q$ is independent of the universe $\phi$, so that once found it may be used in connection with any universe which satisfies the condition that it can be expressed as a continuous function of the parameters. Thus, the distribution of the sum and product in our example may equally well be found for the universe described by the Type III curve, $Ate^{-at}$ $(t > 0)$. For, then

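$$\phi = A^2 t_1 t_2\, e^{-a(t_1 + t_2)} = A^2 y\, e^{-ax},$$

and so, using one-half of the same $q$ as before, since now $x, y \geq 0$,

$$f(x, y) = \frac{A^2 y\, e^{-ax}}{\sqrt{x^2 - 4y}} \quad \text{if } x^2 > 4y, \qquad = 0 \quad \text{if } x^2 < 4y.$$

From this, $F(y)$ can be found by integration (cf. Kullback, 1934):

$$F(y) = A^2 y \int_{2\sqrt{y}}^{\infty} \frac{e^{-ax}}{\sqrt{x^2 - 4y}}\,dx = A^2 y \int_{0}^{\infty} e^{-2a\sqrt{y}\,\cosh u}\,du.$$
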


As another illustration, consider a normal universe of $n$ intercorrelated variables in which all the total intercorrelations are equal to $r$ (e.g., the statures of $n$ brothers) and let the sample be a single group of $n$ (one individual for each variable).

$$\phi = \frac{1}{(2\pi)^{n/2} R^{1/2}}\; e^{-\frac{1}{2R}\left[k_1 \sum_i t_i^2 + k_2 \sum_{i \neq j} t_i t_j\right]},$$

where $R = (1 - r)^{n-1}[1 + (n - 1)r]$, $k_1 = (1 - r)^{n-2}[1 + (n - 2)r]$, and $k_2 = -r(1 - r)^{n-2}$. Suppose one wishes to find the simultaneous distribution of the variance $x$ and the mean $y$ for such samples.* Since for Student's problem Fisher has found the value of $q$ for this $x$ and $y$ to be

$$q = c\,x^{\frac{n-3}{2}}\,\Delta x\,\Delta y,$$

their distribution $f(x, y)$ for this universe may be written down immediately. In terms of $x$ and $y$ the bracket in the exponent of $\phi$ is $y^2(k_1 n - k_2 n + k_2 n^2) + xn(k_1 - k_2)$, and so $f(x, y)$ is the product of $q$ and this form of $\phi$:

$$f(x, y) = K\, x^{\frac{n-3}{2}}\; e^{-\frac{1}{2R}\left[(k_1 n - k_2 n + k_2 n^2)\,y^2 + n(k_1 - k_2)\,x\right]}.$$

(iii) Another attribute of this method is that it sometimes lends itself to easy
extensions from a simple case where there is only one restriction (N — 1 degrees 
of freedom) to similar cases when there are more restrictions. Thus R. A. 
Fisher (1924) proceeded from the variance of a sample from a single universe 
to the variance from a set of universes, as required in the theory of analysis of 
variance; and thus also (1915) he had proceeded from the distribution of r to 
that of multiple R; and Hotelling (1927) showed how these distributions could 
be obtained when the values of each variate were themselves intercorrelated 
(as in a time series) and not merely correlated with values of the other variates. 

Theorem III. Now let us consider again the fundamental form (I). For 
convenience let nN = m. If the conditions will not permit us to write the right 
side in the form in (II), it is still possible that we may be able to find that 
(m + l)-dimensional volume by some other method. In particular, whenever 
it is possible to iterate the integral once we have the formula: 

(III) $$\int_p f\,dX = \int_{T'} dT' \int_{q_m} \phi\,dt_m,$$

where $q_m$ is the section of $q$ by $t_m$ space at the point $(t_1, \cdots, t_{m-1})$ of $T'$ space, $T'$ space being the space of the $(t_1, \cdots, t_{m-1})$ coordinates. With added conditions one may deduce from (III), for the case where there is but a single parameter $x$, the approximate equation:

(IIIa) $$f\,\Delta x = \Delta x \int_{T'} \phi(t_1, \cdots, t_m)\left|\frac{\partial t_m}{\partial x}\right| dT',$$

in which $t_m$ is supposed to have been expressed in terms of the other coordinates by solving the equation $x = g(t_1, \cdots, t_m)$. It is an approximate equation in the same sense as (II) was. Sufficient conditions for this change in the left side of (III) have already been mentioned in discussing (II). The propriety of making the corresponding change in the right hand side may be left for determination when the form of $\phi$ is given. It will perhaps be sufficient here to point out that our earlier example illustrates both the case where this change



* A special case of a more general problem solved first by R. A. Fisher. 
is permissible and where it is not. For, let it be required to find the distribution $f(y)$ of the product $y = t_1 t_2$ without reference to the sum, $t_1 + t_2$. Formula (III) yields

(7) $$f\,\Delta y = 2\int_0^{\infty} dt_1 \int_{y/t_1}^{(y + \Delta y)/t_1} \phi\,dt_2.$$

This is valid for every value of $y$ including $y = 0$. If $y \neq 0$, we may change the right hand side as in (IIIa) and obtain as the probability that $y$ is in the interval $(y, y + \Delta y)$:

(8) $$f\,\Delta y = 2\,\Delta y \int_0^{\infty} \phi\!\left(t_1, \frac{y}{t_1}\right)\frac{dt_1}{t_1} + \epsilon,$$

where $\epsilon$ is a differential of higher order than $\Delta y$. This may be proved by computing the difference between the value of (7) when $t_2$ has constantly the value $(y + \Delta y)/t_1$ and when it has constantly the value $y/t_1$. If $y = 0$ this change in the right side of (7) is not valid; it is easily seen that in this case the integral on the right of (8) is infinite. It may be shown, however, in this case that


*A v 



and that this is an infinitesimal, and that it is of order as small as one.

Many authors think of (IIIa) as the fundamental formula in the theory of probability distributions. One of the simplest and earliest applications of it was to establish the so-called reproductive property of the normal law: that the sum of two variates is distributed normally if each is distributed normally. Jackson (1935) has used it to establish a similar property for two Type III distributions which have the same exponent of $e$. Usually this integral is difficult to evaluate when $N > 2$ because of the unsymmetrical form into which it is cast, but when $N = 2$ and there is but one parameter, (IIIa) is perhaps the most convenient of all the formulae.

Theorem IV. An exceedingly useful formula is obtainable from (I) in the following manner. Let $\theta(x_1, \cdots, x_P;\ \alpha_1, \cdots, \alpha_Q)$ be a finite single valued function of the old parameters $(x)$ and of some new parameters $(\alpha)$. Subject to general conditions to be stated we may write:

(IV) $$\int_X \theta f\,dX = \int_T \theta'\phi\,dT,$$

an identity with respect to each $\alpha$, where $\theta'$ is the result of substituting (2) for the $x$'s in $\theta$.

Since this theorem has not been proved in this general form, an outline of 
the proof will be given. Sufficient conditions are: 

(a) All the integrals involved shall exist.

(b) If $p$ is limited (in the sense that it lies within a finite hypersphere), so is $q$, and conversely.

Proof. Let $X_0$ be a limited $p$ set and $T_0$ the corresponding $q$ set such that both (c) and (d) hold ($\epsilon > 0$):

(c) $$\left|\int_X \theta f\,dX - \int_{X_0} \theta f\,dX\right| < \epsilon,$$

(d) $$\left|\int_T \theta'\phi\,dT - \int_{T_0} \theta'\phi\,dT\right| < \epsilon.$$

It is easy to see that such an $X_0$ and a corresponding $T_0$ do exist, as follows:

Let $X_0'$ be a limited set for which (c) is true, and for which it will remain true no matter what points are added to $X_0'$. Similarly, let $T_0'$ be a limited set for which (d) is true and for which it will remain true, no matter what points are added to $T_0'$. Presumably $X_0'$ and $T_0'$ do not correspond to each other, but we may now let $X_0$ be the totality of all the points of $X_0'$ and of all those points of $X$ corresponding to $T_0'$, and let $T_0$ be the totality of all the points of $T_0'$ and of all those points of $T$ corresponding to $X_0'$. Then $X_0$ and $T_0$ do correspond to each other and have the desired properties (c) and (d). Now, since $\theta$ is finite, it is limited in $X_0$. Let

(e) $$|\theta| < H \ \text{in } X_0.$$

Divide the interval $(-H, H)$ into $s$ equal subintervals of length $h$, thus defining in $X_0$ according to Lebesgue the measurable sets, $p_i$ $(i = 1, \cdots, s)$, and corresponding $q_i$ sets in $T_0$:

(f) $$-H + (i-1)h \leq \theta \leq -H + ih \ \text{in } p_i, \qquad -H + (i-1)h \leq \theta' \leq -H + ih \ \text{in } q_i.$$

Choose arbitrarily any point of $p_i$ and let $k_i$ be the corresponding value of $\theta$. Then let

$$\bar\theta = k_i \ \text{in } p_i \ (i = 1, \cdots, s), \ \text{and similarly let} \ \bar\theta' = k_i \ \text{in } q_i \ (i = 1, \cdots, s).$$

Then

$$|\theta - \bar\theta| \leq h \ \text{in } X_0 \quad \text{and} \quad |\theta' - \bar\theta'| \leq h \ \text{in } T_0.$$

Since by (I) $\int_{p_i} f\,dX = \int_{q_i} \phi\,dT$,

(g) $$\int_{X_0} \bar\theta f\,dX = \int_{T_0} \bar\theta'\phi\,dT.$$

Now

$$\left|\int_{X_0} (\theta - \bar\theta) f\,dX\right| \leq \int_{X_0} |\theta - \bar\theta|\, f\,dX \leq h \int_{X_0} f\,dX,$$

and

$$\left|\int_{T_0} (\theta' - \bar\theta')\phi\,dT\right| \leq h \int_{T_0} \phi\,dT.$$

So, as $h$ approaches zero both sides of (g) approach limits and their limits are equal:

$$\int_{X_0} \theta f\,dX = \int_{T_0} \theta'\phi\,dT.$$

Hence by (c) and (d) the integrals

$$\int_X \theta f\,dX, \qquad \int_T \theta'\phi\,dT,$$

differ at most by $2\epsilon$, and so, being independent of $\epsilon$, they do not differ at all.

In order to determine the form of $f$ from (IV) one must first evaluate the right side,

(9) $$\int_T \theta'\phi\,dT = \psi(\alpha_1, \cdots, \alpha_Q);$$

and then solve the integral equation,

(10) $$\int_X \theta f\,dX = \psi.$$

It is the solution of this equation that usually presents the most difficulty. Particular forms of $\theta$ that are being used are

(11) $$\theta = e^{\alpha_1 x_1 + \cdots + \alpha_P x_P},$$

in which case $\psi$ is said to be the “characteristic function” or “moment generating function”; and

(12) $$\theta = x_1^{\alpha_1} \cdots x_P^{\alpha_P},$$

in which case $\psi$ is a “moment function” or “moment” of $f$. Other forms might be used. For example, a very convenient method of demonstrating the correctness of the usual formula for the simultaneous distribution of the correlation ($x$), means ($y$, $z$), and variances ($u$, $v$), in samples from a normal bivariate universe is by the use of

q ^ (« ! + v* + y* + **) + aj(uw + H*) 


This method of finding $f$ is not a final determination of the probability function desired until it has been shown that the solution is unique, a serious problem in itself; it is one of those which Professor Shohat may consider.⁴ There are three methods of solving the integral equation (10):

( i ) The first might be called guessing. Though unscientific, it is in fact 
often effective. Especially is it available if the distribution has already been 
surmised but not demonstrated. Thus, it was open to Student (1908) when 
he correctly surmised the distribution of the variance. Similarly it was open 
to Soper (1913) when he incorrectly surmised the distribution of r. 

(ii) Papers by Romanovsky (1925) and Wilks (1932) have shown how the problem of solving the integral equation may be shifted to the problem of solving a partial differential equation, but this in turn may involve the solution of another equally difficult integral equation in the process of determining the arbitrary function.

(iii) If each $\alpha$ be replaced by an imaginary $i\beta$ and one uses a Fourier transform, one arrives at a set of formulae which are most important. For the case where there is but one $x$ and one $\beta$, they may be written:

(13) $$\int_{-\infty}^{\infty} e^{i\beta x} f(x)\,dx = \int_T e^{i\beta g}\phi\,dT = \psi(\beta),$$

(14) $$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-i\beta x}\psi(\beta)\,d\beta.$$

Dodd (1925) has given an equivalent set of formulae involving only real variables. It is easy to prove that both sets may be changed to the single formula,

(15) $$f(x) = \frac{1}{\pi}\int_T \phi\,dT \int_0^{\infty} \cos\beta(x - g)\,d\beta.$$

Kullback (1936) has established the validity of the formulae corresponding to (13) and (14) for the general case of $(P + Q)$ parameters. Wishart and Bartlett (1933) used the general forms to find the distribution of the generalized product moment in samples from an $n$-dimensional normal system.
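
As an illustration of formulae (13) and (14), the following sketch inverts a characteristic function numerically. It is not part of the original paper: the function names, the trapezoidal rule, and the truncation of the infinite range at plus or minus 12 are assumptions made for the example. It takes a normal universe, whose characteristic function is exp(-sigma^2 beta^2 / 2), and it also checks the reproductive property of the normal law by inverting the square of that function.

```python
import cmath
import math


def char_fn_normal(beta, sigma=1.0):
    """psi(beta) for a normal universe with mean 0 and standard deviation sigma."""
    return math.exp(-0.5 * (sigma * beta) ** 2)


def invert(psi, x, limit=12.0, steps=4000):
    """Approximate formula (14), f(x) = (1/2pi) * integral exp(-i beta x) psi(beta) d beta,
    by the trapezoidal rule on [-limit, limit] (the truncation is an assumption)."""
    h = 2.0 * limit / steps
    total = 0.0 + 0.0j
    for j in range(steps + 1):
        beta = -limit + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * cmath.exp(-1j * beta * x) * psi(beta)
    return (total * h / (2.0 * math.pi)).real


def normal_density(x, sigma=1.0):
    """Ordinate of the normal curve with mean 0 and standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


if __name__ == "__main__":
    # Single variate: inversion should recover the normal ordinate.
    print(invert(char_fn_normal, 0.7), normal_density(0.7))
    # Sum of two independent variates: psi is squared, the variance doubles.
    print(invert(lambda b: char_fn_normal(b) ** 2, 1.0),
          normal_density(1.0, math.sqrt(2.0)))
```

The second comparison is the reproductive property in miniature: the characteristic function of a sum of two independent variates is the product of their characteristic functions, and the inversion returns a normal curve again.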

When the solution of the integral equations of (IV) cannot be found, one 
has to put up with the semi-invariants or with the moments of /. Formulae 
(IV) and (11) yield the semi-invariants, (IV) and (12) the moments about the 
given origin, and from either of these one may obtain the moments about the 
mean point. These methods are old but they are still important. Time does 
not permit me to discuss them, because it would not be proper to close this 
paper without some reference to limit methods. 

Limit Methods. It is well known that the distribution of means of samples taken from almost⁵ any universe approaches the normal law as a limit as $N$ becomes infinite. This theorem is subject to great generalizations, as is indicated in papers of A. Liapounoff (1901), S. Bernstein (1926), Romanovsky (1929, 1930) and C. C. Craig (1932). Subject to very general conditions it has been shown that: If the characteristic function of one probability distribution contains a parameter and approaches as a limit, uniformly in every finite domain of its variables, the characteristic function of another probability distribution; then the first distribution approaches as a limit the second distribution. Hence S. Bernstein and Romanovsky have shown that: If the universe is an $n$-way correlation solid of a certain very general type, then the $n$ means obtained by a selection of a sample of $N$ sets of variates, $x_i = \frac{1}{N}(t_{i1} + \cdots + t_{iN})$, $(i = 1, \cdots, n)$, have a distribution which approaches as a limit a normal correlation solid as $N$ becomes infinite. A similar theorem has been established also in the interesting case of Romanovsky’s “belonging coefficients”, which include K. Pearson’s coefficient of racial likeness. Also, by the method of maximum likelihood, Hotelling (1930) has proved that under certain general conditions all optimum estimates of the parameters of a frequency distribution have a joint distribution approaching the normal as $N$ becomes infinite. The validity of the method of maximum likelihood when used for this purpose has been established by J. L. Doob (1934).

⁴ In a later paper at the same symposium.

⁵ There are exceptions. E.g., means of samples taken from the universe $a/[\pi(a^2 + t^2)]$ have a distribution identical with the universe itself.

Finally, one may note an apparently new limit theorem of another type. 
Its general nature will be obvious from the following application: 

Let a sample of $N$ be drawn from the universe,

$$\phi = A e^{-t^{\lambda}} \ \text{if } t > 0, \qquad \phi = 0 \ \text{if } t \leq 0.$$

It is readily proved, by means of (IV), that the distribution $f(x)$ of the parameter,

$$x = (t_1^{\lambda} + \cdots + t_N^{\lambda})^{1/\lambda},$$

is a curve of the form,

$$f(x) = B x^{N-1} e^{-x^{\lambda}} \ \text{where } x > 0, \qquad f(x) = 0 \ \text{elsewhere}.$$

Now let $\lambda$ become infinite. The universe approaches as a limit the rectangle:

$$\phi = A \ \text{where } 0 \leq t < 1, \qquad \phi = 0 \ \text{elsewhere}.$$

The parameter $x$ approaches as a limit $X$, where $X$ = maximum $t_i$. The distribution $f(x)$ approaches as a limit the new distribution,

$$F(X) = N X^{N-1} \ \text{where } 0 < X < 1, \qquad F(X) = 0 \ \text{elsewhere}.$$

Hence we have proved in a new way, what was already known: that the distribution of the greatest variate obtained by sampling from a rectangular universe is of the form $F(X)$.
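
The passage to the limit can be watched numerically. The sketch below is an illustration only and is not from the original paper: it assumes the density has the form x^(N-1) exp(-x^lambda) on x > 0, normalises it by a midpoint rule on a finite range (the range and step count are arbitrary choices), and compares the probability that x is below 1/2 with (1/2)^N, which is what F(X) = N X^(N-1) gives.

```python
import math


def density_unnormalised(x, n, lam):
    """x^(n-1) * exp(-x^lam), the assumed form of f(x) up to the constant B."""
    return x ** (n - 1) * math.exp(-(x ** lam))


def cdf(c, n, lam, upper=2.0, steps=20000):
    """Probability that x < c, by the midpoint rule on (0, upper).

    The upper limit 2.0 is adequate only when lam is large enough for the
    density to be negligible beyond it (true for the values used below).
    """
    h = upper / steps
    below = 0.0
    total = 0.0
    for j in range(steps):
        x = (j + 0.5) * h
        weight = density_unnormalised(x, n, lam) * h
        total += weight
        if x < c:
            below += weight
    return below / total


if __name__ == "__main__":
    n = 4
    for lam in (20.0, 200.0):
        print(lam, cdf(0.5, n, lam), 0.5 ** n)
```

For N = 4 the computed probability moves toward 0.0625 as lambda increases, in agreement with the rectangular limit.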

The limit theorem implicit in this illustration can be established in sufficient generality, but I do not yet know whether it has other applications of value.
MOMENT RECURRENCE RELATIONS FOR BINOMIAL, POISSON
AND HYPERGEOMETRIC FREQUENCY DISTRIBUTIONS¹

By John Riordan 

1. Introduction. This paper gives the development of recurrence relations for moments about the origin and mean of binomial, Poisson, and hypergeometric frequency distributions from the basis of the moment arrays defined by H. E. Soper.² This procedure has the advantage of expressing the moments in terms of coefficients which are alike for the three distributions and are derivable by a single process, thus providing a degree of formal coordination of the distributions. For both kinds of moments, the coefficients satisfy relatively simple recurrence relations, the use of which leads to recurrence relations for the moments, thus unifying the derivation of these relations for the three distributions. The relations derived in this way for the hypergeometric distribution are apparently new. Apparently new recurrence relations for certain auxiliary coefficients in the expression of the moments about the mean of binomial and Poisson distributions are also given.

This course of development involves repetition of a number of well-known results which is justified, it is hoped, by the unification obtained.³


1 Presented to the American Mathematical Society, Sept. 3, 1936. 

2 Frequency Arrays, Cambridge, 1922. 

3 The following bibliography is taken from a paper On the Bernoulli Distribution, Solomon Kullback, Bull. Am. Math. Soc., 41, 12, pp. 857-864 (Dec., 1935):

A. Fisher, The Mathematical Theory of Probabilities, 2d ed., p. 104 ff.

H. L. Rietz, Mathematical Statistics, 1927, p. 26 ff.

V. Mises, Wahrscheinlichkeitsrechnung, 1931, pp. 131-133.

Risser and Traynard, Les Principes de la Statistique Mathématique, 1933, pp. 39-40 and 320-321.

V. Romanovsky, Note on the moments of the binomial $(q + p)^n$ about its mean, Biometrika, vol. 15 (1923), pp. 410-412.

A. T. Craig, Note on the moments of a Bernoulli distribution, Bull. Am. Math. Soc., vol. 40 (1934), pp. 262-264.

A. R. Crathorne, Moments de la binomiale par rapport à l'origine, Comptes Rendus, vol. 198 (1934), p. 1202.

A. A. K. Ayyangar, Note on the recurrence formulae for the moments of the point binomial, Biometrika, vol. 26 (1934), pp. 262-264.

Ch. Jordan, Statistique Mathématique, Paris, 1927.

K. Pearson, On Certain Properties of the Hypergeometric Series . . . , Phil. Mag., 47, pp. 236-246 (1899).
2. Moment Arrays. As developed by Soper, frequency distributions may be exhibited by frequency arrays, in the case of a single variate, in the form:

(2.1) $$f(A) = \sum_x p_x A^x$$

where $p_x$ are the frequencies with which the measures, $x$, of the character, $A$, occur in a population.

The substitution $A = e^{\alpha}$ leads to the moment about the origin array:

$$f(e^{\alpha}) = \sum_x p_x e^{\alpha x} = \sum_x p_x\left(1 + x\alpha + \frac{x^2\alpha^2}{2!} + \cdots\right) = \sum_s m_s \frac{\alpha^s}{s!}$$

where

$$m_s = \sum_x p_x x^s.$$

The symbol $\alpha$ is a logical or umbral symbol serving merely to identify the moments in the expansion of the array.

The moment array for moments about the mean is found from the relation:

$$e^{-m_1\alpha} f(e^{\alpha}) = \sum_s \mu_s \frac{\alpha^s}{s!}$$

where $m_1$ is the first moment about the origin.

The moment arrays for the distributions concerned are as follows:

Binomial $$f(e^{\alpha}) = [1 + p(e^{\alpha} - 1)]^n = \sum_{x=0}^{\infty} \binom{n}{x} p^x (e^{\alpha} - 1)^x$$

Poisson $$f(e^{\alpha}) = e^{a(e^{\alpha} - 1)} = \sum_{x=0}^{\infty} \frac{a^x (e^{\alpha} - 1)^x}{x!}$$

Hypergeometric $$f(e^{\alpha}) = \sum_{x=0}^{\infty} \frac{(l)_x (r)_x}{(n)_x}\,\frac{(e^{\alpha} - 1)^x}{x!}$$

where the parameters $p$, $n$, and $a$ for the binomial and Poisson have the usual significance. The parameters for the hypergeometric distribution, with the substitution $r = s$, follow Soper; Pearson (loc. cit.) uses $q$, $r$, $n$, where $q = l/n$. The notation $(l)_x$ means

$$(l)_x = l(l - 1) \cdots (l - x + 1).$$

It will be seen that, with the usual interpretation of $\binom{n}{x}$ as zero for $x > n$, the three distributions so far as concerns $\alpha$ may be exhibited by a function of the form

$$f(e^{\alpha}) = \sum_{x=0}^{\infty} A_x (e^{\alpha} - 1)^x$$

where $A_x$ of course depends on the distribution concerned.

3. Moments About the Origin. The moments about the origin can then be defined by the equation:

(3.1) $$\sum_{s=0}^{\infty} m_s \frac{\alpha^s}{s!} = \sum_{x=0}^{\infty} A_x (e^{\alpha} - 1)^x$$

and

$$\sum_{x=0}^{\infty} A_x (e^{\alpha} - 1)^x = \sum_{x=0}^{\infty} A_x \sum_{v=0}^{x} (-1)^{x-v} \binom{x}{v} e^{\alpha v} = \sum_{s=0}^{\infty} \frac{\alpha^s}{s!} \sum_{x=0}^{s} x!\,A_x S_{x,s}$$

where $S_{x,s}$ is a Stirling number of the second kind, as used by Jordan (loc. cit.) and defined by

$$x!\,S_{x,s} = \sum_{v=0}^{x} (-1)^{x-v} \binom{x}{v} v^s = \Delta^x 0^s,$$

$\Delta^x 0^s$ being, in the language of the finite difference calculus, a “difference of nothing”, that is $\Delta^x n^s \big|_{n=0}$.

The internal series terminates at $s$ because $S_{x,s} = 0$, $x > s$, as is readily apparent in the finite difference expression. Further $S_{0,s} = 0$, $s \neq 0$; $S_{0,0} = 1$.

By equating coefficients in equation (3.1), $m_s$, the $s$th moment about the origin, is given by

(3.2) $$m_s = \sum_{x=0}^{s} x!\,A_x S_{x,s}.$$

The particular forms for the three distributions are as follows:

(3.3) $$m_s = \sum_{x=0}^{s} (n)_x\, p^x S_{x,s} \qquad \text{Binomial}$$

(3.4) $$m_s = \sum_{x=0}^{s} a^x S_{x,s} \qquad \text{Poisson}$$

(3.5) $$m_s = \sum_{x=0}^{s} \frac{(l)_x (r)_x}{(n)_x} S_{x,s} \qquad \text{Hypergeometric}$$

The Stirling numbers have the following recurrence relation (Jordan loc. cit.):

(3.6) $$S_{x,s+1} = x S_{x,s} + S_{x-1,s}.$$
This relation in conjunction with equations (3.3)-(3.5) leads to moment recurrence relations. The procedure is illustrated for the binomial distribution as follows:

$$\begin{aligned} m_{s+1} &= \sum_{x=0}^{s+1} (n)_x\, p^x S_{x,s+1} \\ &= \sum_{x=0}^{s+1} (n)_x\, p^x (x S_{x,s} + S_{x-1,s}) \\ &= pD_p m_s + (np\,m_s - p^2 D_p m_s) \\ &= np\,m_s + pq D_p m_s \end{aligned}$$

where $q = 1 - p$.

The steps in the process are expanded as follows:

$$\begin{aligned} \sum_{x=0}^{s} x\,(n)_x\, p^x S_{x,s} &= \sum_{x=0}^{s} (n)_x S_{x,s}\, pD_p(p^x) \\ &= pD_p m_s \end{aligned}$$

$$\begin{aligned} \sum_{x=0}^{s+1} (n)_x\, p^x S_{x-1,s} &= \sum_{x=0}^{s+1} (n - x + 1)(n)_{x-1}\, p^x S_{x-1,s} \\ &= n \sum_{x=0}^{s} (n)_x\, p^{x+1} S_{x,s} - \sum_{x=0}^{s} x\,(n)_x\, p^{x+1} S_{x,s} \\ &= np\,m_s - p^2 D_p m_s \end{aligned}$$

The results for the three distributions are as follows:

(3.7) $$m_{s+1} = np\,m_s + pq D_p m_s \qquad \text{Binomial}$$

(3.8) $$m_{s+1} = a\,m_s + a D_a m_s \qquad \text{Poisson}$$

(3.9) $$m_{s+1} = \frac{lr}{n}\, m_s(l - 1, r - 1, n - 1) - (n + 1)\Delta_n m_s \qquad \text{Hypergeometric}$$

Here $D_p$ and $D_a$ denote differentiation with respect to $p$ and $a$, respectively, and $\Delta_n$ denotes the difference operation with respect to $n$. For the hypergeometric distribution the moments are functions of $l$, $r$, and $n$ as well as of $s$; $m_s(l - 1, r - 1, n - 1)$ is the same function of $l - 1$, $r - 1$ and $n - 1$ as $m_s(l, r, n)$ is of $l$, $r$, $n$. Equation (3.9) appears to be new.
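
Relation (3.9) can be checked in exact arithmetic. The following sketch is not from the original paper and rests on two stated assumptions: that the hypergeometric distribution is the one obtained by drawing r individuals from a population of n of which l are marked (which gives the factorial moments (l)_x (r)_x / (n)_x used in (3.5)), and that the difference operator is the forward difference, so that Delta_n m_s = m_s(l, r, n + 1) - m_s(l, r, n). The function names are illustrative.

```python
from fractions import Fraction
from math import comb


def hyper_moment(s, l, r, n):
    """s-th moment about the origin of the number of marked individuals
    in a sample of r drawn from a population of n containing l marked."""
    total = Fraction(0)
    for k in range(min(l, r) + 1):
        # comb returns 0 when r - k exceeds n - l, so impossible counts drop out.
        prob = Fraction(comb(l, k) * comb(n - l, r - k), comb(n, r))
        total += prob * k ** s
    return total


def relation_39_holds(s, l, r, n):
    """Compare both sides of (3.9) exactly for the given s, l, r, n."""
    left = hyper_moment(s + 1, l, r, n)
    shifted = Fraction(l * r, n) * hyper_moment(s, l - 1, r - 1, n - 1)
    difference = hyper_moment(s, l, r, n + 1) - hyper_moment(s, l, r, n)
    return left == shifted - (n + 1) * difference


if __name__ == "__main__":
    for s in range(5):
        print(s, relation_39_holds(s, 5, 4, 12))
```

Because the moments are held as fractions, agreement is exact rather than approximate for the parameter values tried.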
For convenience of reference, a short table of the Stirling numbers of the second kind follows:

                  S_{x,s}
   s \ x     0     1     2     3     4     5
     0       1
     1       0     1
     2       0     1     1
     3       0     1     3     1
     4       0     1     7     6     1
     5       0     1    15    25    10     1
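
The table of Stirling numbers can be regenerated from recurrence (3.6), and formula (3.3) can then be compared with a direct summation over the binomial distribution. The sketch below is an illustration, not the author's procedure; the function names are hypothetical and exact rational arithmetic is used only so that the comparison is an equality.

```python
from fractions import Fraction
from math import comb


def stirling_table(max_s):
    """table[s][x] = S_{x,s}, built from S_{x,s+1} = x*S_{x,s} + S_{x-1,s} with S_{0,0} = 1."""
    table = [[0] * (max_s + 1) for _ in range(max_s + 1)]
    table[0][0] = 1
    for s in range(max_s):
        for x in range(1, s + 2):
            table[s + 1][x] = x * table[s][x] + table[s][x - 1]
    return table


def falling(n, x):
    """(n)_x = n(n-1)...(n-x+1)."""
    result = 1
    for j in range(x):
        result *= n - j
    return result


def binomial_moment_via_stirling(s, n, p, table):
    """Formula (3.3): m_s = sum over x of (n)_x p^x S_{x,s}."""
    return sum(falling(n, x) * p ** x * table[s][x] for x in range(s + 1))


def binomial_moment_direct(s, n, p):
    """m_s computed directly from the binomial frequencies."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) * k ** s for k in range(n + 1))


if __name__ == "__main__":
    table = stirling_table(5)
    for row in table:
        print(row)
    p = Fraction(1, 3)
    print(binomial_moment_via_stirling(4, 7, p, table), binomial_moment_direct(4, 7, p))
```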


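4. Moments About the Mean. As shown in Section 2 above, moments about the mean may be defined as follows:

(4.1) $$\sum_{s=0}^{\infty} \mu_s \frac{\alpha^s}{s!} = e^{-m_1\alpha} \sum_{x=0}^{\infty} A_x (e^{\alpha} - 1)^x$$

where $m_1$ is the first moment about the origin:

$m_1 = np$ (Binomial), $m_1 = a$ (Poisson), $m_1 = lr/n$ (Hypergeometric).

Now

$$\sum_{x=0}^{\infty} A_x e^{-m_1\alpha}(e^{\alpha} - 1)^x = \sum_{x=0}^{\infty} A_x \sum_{v=0}^{x} (-1)^{x-v} \binom{x}{v} e^{(v - m_1)\alpha} = \sum_{s=0}^{\infty} \frac{\alpha^s}{s!} \sum_{x=0}^{s} x!\,A_x \sigma_{x,s},$$

where

$$x!\,\sigma_{x,s} = \sum_{v=0}^{x} (-1)^{x-v} \binom{x}{v} (v - m_1)^s = \Delta^x (-m_1)^s.$$

It will be observed that for $m_1 = 0$, $\sigma_{x,s} = S_{x,s}$. The internal series terminates at $s$ for the same reason as before.

The moments about the mean are then given by:

(4.2) $$\mu_s = \sum_{x=0}^{s} x!\,A_x \sigma_{x,s}.$$

The particular forms for the three distributions are as follows:

(4.3) $$\mu_s = \sum_{x=0}^{s} (n)_x\, p^x \sigma_{x,s} \qquad \text{Binomial}$$

(4.4) $$\mu_s = \sum_{x=0}^{s} a^x \sigma_{x,s} \qquad \text{Poisson}$$

(4.5) $$\mu_s = \sum_{x=0}^{s} \frac{(l)_x (r)_x}{(n)_x} \sigma_{x,s} \qquad \text{Hypergeometric}$$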
The coefficients $\sigma_{x,s}$ satisfy the following recurrence relation:⁴

(4.6) $$\sigma_{x,s+1} = (x - m_1)\sigma_{x,s} + \sigma_{x-1,s}$$

which in conjunction with equations (4.3)-(4.5) leads to moment recurrence relations as before. The actual derivation is somewhat complicated by the circumstance that $\sigma_{x,s}$ is a function of $m_1$ and therefore of the frequency parameters, rather than a constant as before. The derivation is illustrated for the binomial distribution as follows:

$$\begin{aligned} \mu_{s+1} &= \sum_{x=0}^{s+1} (n)_x\, p^x \sigma_{x,s+1} \\ &= \sum_{x=0}^{s+1} (n)_x\, p^x [(x - np)\sigma_{x,s} + \sigma_{x-1,s}] \\ &= \sum_{x=0}^{s} (n)_x \sigma_{x,s}\, pD_p(p^x) - np\,\mu_s + \sum_{x=0}^{s+1} (n)_x\, p^x \sigma_{x-1,s} \\ &= pD_p \mu_s + nsp\,\mu_{s-1} - np\,\mu_s + np\,\mu_s - p^2[D_p \mu_s + ns\,\mu_{s-1}] \\ &= pq\,[ns\,\mu_{s-1} + D_p \mu_s] \end{aligned}$$

The steps in the process are expanded as follows:

$$\begin{aligned} \sum_{x=0}^{s} (n)_x \sigma_{x,s}\, pD_p(p^x) &= \sum_{x=0}^{s} (n)_x [pD_p(p^x \sigma_{x,s}) - p^x\, pD_p(\sigma_{x,s})] \\ &= pD_p \mu_s - p \sum_{x=0}^{s} (n)_x\, p^x (-ns\,\sigma_{x,s-1}) \\ &= pD_p \mu_s + nsp\,\mu_{s-1} \end{aligned}$$

$$\begin{aligned} \sum_{x=0}^{s+1} (n)_x\, p^x \sigma_{x-1,s} &= \sum_{x=0}^{s+1} (n - x + 1)(n)_{x-1}\, p^x \sigma_{x-1,s} \\ &= n \sum_{x=0}^{s} (n)_x\, p^{x+1} \sigma_{x,s} - \sum_{x=0}^{s} x\,(n)_x\, p^{x+1} \sigma_{x,s} \\ &= np\,\mu_s - p^2 [D_p \mu_s + ns\,\mu_{s-1}] \end{aligned}$$

The relation $D_p \sigma_{x,s} = -ns\,\sigma_{x,s-1}$ is obtained from the definition equation of $\sigma_{x,s}$ (with $m_1 = np$).

The resulting recurrence relations for the three distributions are as follows:

(4.7) $$\mu_{s+1} = nspq\,\mu_{s-1} + pq D_p \mu_s \qquad \text{Binomial}$$

(4.8) $$\mu_{s+1} = as\,\mu_{s-1} + a D_a \mu_s \qquad \text{Poisson}$$
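
Recurrence (4.8) can be run mechanically by holding each Poisson moment about the mean as a list of coefficients of powers of a. The sketch below is illustrative and not part of the original paper; the list representation and the function name are assumptions. It starts from mu_0 = 1 and mu_1 = 0.

```python
def poisson_central_moments(max_s):
    """Return mu, where mu[s][k] is the coefficient of a^k in mu_s,
    generated from mu_{s+1} = a*s*mu_{s-1} + a*d(mu_s)/da."""
    mu = [[1], [0]]
    for s in range(1, max_s):
        prev, cur = mu[s - 1], mu[s]
        nxt = [0] * (max(len(prev), len(cur)) + 1)
        for power, coeff in enumerate(prev):
            nxt[power + 1] += s * coeff      # a * s * mu_{s-1}
        for power, coeff in enumerate(cur):
            nxt[power] += power * coeff      # a * d/da of coeff * a^power
        while len(nxt) > 1 and nxt[-1] == 0:
            nxt.pop()                        # drop trailing zero coefficients
        mu.append(nxt)
    return mu


if __name__ == "__main__":
    for s, coeffs in enumerate(poisson_central_moments(6)):
        print(s, coeffs)
```

The run gives mu_2 = a, mu_3 = a, mu_4 = a + 3a^2, mu_5 = a + 10a^2 and mu_6 = a + 25a^2 + 15a^3, the familiar Poisson values.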


4 Jordan, loc. cit. or E. C. Molina, An Expansion for Laplacian Integrals . . . , Bell 
System Technical Journal, 11 , p. 571. 
(4.9)


where 


M«+i 


(n+ 1) , r, n + 1) J 

n [* * £ C)® - hr-l,n-l)] 


Hypergeometric 


lr lr 

»(n +1) * » 

q-D(r- 

U -1) 


lr 

n 


The last of these, which appears to be new, seems to be of formal interest only. The coefficients $\sigma_{x,s}$ are related to the Stirling numbers by the expression:

$$\sigma_{x,s} = \sum_{v=0}^{s} (-1)^v \binom{s}{v} S_{x,s-v}\, m_1^v = \sum_{v} a_v m_1^v$$

and consequently can be exhibited with detached coefficients in the form $a_0 + a_1 + a_2 + \cdots + a_{s-x}$. For the binomial and Poisson distributions certain simplifications, to be developed in the section following, in equations (4.3) and (4.4) may be made. For the hypergeometric distribution it appears necessary to use equation (4.5); the following short table of $\sigma_{x,s}$, employing the detached coefficients mentioned above, is given for this purpose:

   s \ x         0               1               2             3          4      5
     1          0-1              1
     2         0+0+1            1-2              1
     3        0+0+0-1          1-3+3            3-3            1
     4       0+0+0+0+1        1-4+6-4          7-12+6         6-4         1
     5      0+0+0+0+0-1     1-5+10-10+5     15-35+30-10     25-30+10     10-5    1
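
The detached coefficients in this table follow directly from the expression connecting the sigma coefficients with the Stirling numbers. A minimal sketch, not from the original paper and with hypothetical function names, computes the coefficients of successive powers of m_1 for given x and s:

```python
from math import comb, factorial


def stirling2(x, s):
    """S_{x,s} from x! S_{x,s} = sum over v of (-1)^(x-v) C(x,v) v^s."""
    total = sum((-1) ** (x - v) * comb(x, v) * v ** s for v in range(x + 1))
    return total // factorial(x)


def sigma_coefficients(x, s):
    """Coefficients a_0, ..., a_{s-x} of m_1^0, ..., m_1^(s-x) in sigma_{x,s}."""
    return [(-1) ** v * comb(s, v) * stirling2(x, s - v) for v in range(s - x + 1)]


if __name__ == "__main__":
    for s in range(1, 6):
        print(s, [sigma_coefficients(x, s) for x in range(s + 1)])
```

For example, x = 2 and s = 5 returns 15, -35, 30, -10, which is the entry in the last row of the table.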


5. Binomial and Poisson Moments About the Mean—Simplified Formulas. 
5.1 Binomial. From examination of the first few moments about the mean, it appears expedient⁵ to write the formulas:

(5.1.1) $$\mu_{2s} = \sum_{x=1}^{s} \alpha_{x,2s}\,(npq)^x, \qquad \mu_{2s+1} = (q - p)\sum_{x=1}^{s} \alpha_{x,2s+1}\,(npq)^x.$$

⁵ The kind of expression chosen admits of some variety. A recurrence relation for coefficients in the expansion in powers of $p$ has been given by E. H. Larguier, On a Method For Evaluating the Moments of a Bernoulli Distribution, Bull. Am. Math. Soc., 42, 1, p. 24 (Abstract 8); I am indebted to Mr. Larguier for the opportunity of examining his results in advance of publication.

When these are substituted into the moment recurrence relation, the coefficients are found to be related as follows:

$$\alpha_{x,2s} = [x + pqD_{pq}]\,\alpha_{x,2s-1} + (2s - 1)\,\alpha_{x-1,2s-2} - 2pq\,[1 + 2x + 2pqD_{pq}]\,\alpha_{x,2s-1}$$

$$\alpha_{x,2s+1} = [x + pqD_{pq}]\,\alpha_{x,2s} + 2s\,\alpha_{x-1,2s-1}$$

or, in general,

(5.1.2) $$\alpha_{x,s+1} = [x + pqD_{pq}]\,\alpha_{x,s} + s\,\alpha_{x-1,s-1} - pq\,[1 - (-1)^s]\,[1 + 2x + 2pqD_{pq}]\,\alpha_{x,s}$$

Using detached coefficients of powers of $pq$ as outlined above, these coefficients may be exhibited as follows:

                                      α_{x,s}
   s \ x             1                       2                  3            4
     2               1
     3               1
     4             1 - 6                     3
     5             1 - 12                   10
     6          1 - 30 + 120             25 - 130              15
     7          1 - 60 + 360             56 - 462             105
     8      1 - 126 + 1680 - 5040    119 - 2156 + 7308     490 - 2380       105
     9      1 - 252 + 5040 - 20160   246 - 6948 + 32112   1918 - 13216     1260

It may be noted that the coefficients of the first column in conjunction with equations (5.1.1) give the binomial seminvariants.
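
The table can be reproduced from recurrence (5.1.2) by treating each coefficient as a polynomial in pq. The sketch below is an illustration and not the author's computation; the dictionary layout and function name are assumptions. It uses the facts that the operator [x + pq D] multiplies the coefficient of (pq)^k by (x + k), and that pq [1 + 2x + 2 pq D] multiplies it by (1 + 2x + 2k) and raises the power by one. The starting value is the single entry for x = 1, s = 2, that is mu_2 = npq.

```python
def alpha_table(max_s):
    """alpha[(x, s)] = list of coefficients of (pq)^0, (pq)^1, ... from (5.1.2)."""
    alpha = {(1, 2): [1]}
    for s in range(2, max_s):
        odd = s % 2 == 1                      # 1 - (-1)^s is 2 for odd s, 0 for even s
        for x in range(1, (s + 1) // 2 + 1):
            cur = alpha.get((x, s), [])
            prev = alpha.get((x - 1, s - 1), [])
            new = [0] * max(len(cur) + 1, len(prev), 1)
            for k, c in enumerate(cur):
                new[k] += (x + k) * c         # [x + pq D] alpha_{x,s}
                if odd:
                    new[k + 1] -= 2 * (1 + 2 * x + 2 * k) * c
            for k, c in enumerate(prev):
                new[k] += s * c               # s * alpha_{x-1,s-1}
            while new and new[-1] == 0:
                new.pop()
            if new:
                alpha[(x, s + 1)] = new
    return alpha


if __name__ == "__main__":
    table = alpha_table(9)
    for s in range(2, 10):
        print(s, [table.get((x, s)) for x in range(1, 5)])
```

Running it to s = 9 returns, among others, 246, -6948, 32112 for x = 2, which settles the digit that is illegible in the printed table.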

Equations (5.1.1) make the coefficients functions of $pq$ only; a slight alteration makes the coefficients functions of $n$ only. Thus:

(5.1.3) $$\mu_{2s} = \sum_{x=1}^{s} \beta_{x,2s}\,(pq)^x, \qquad \mu_{2s+1} = (q - p)\sum_{x=1}^{s} \beta_{x,2s+1}\,(pq)^x$$

and the coefficients are found to satisfy the recurrence relation:

(5.1.4) $$\beta_{x,s+1} = x\,\beta_{x,s} + ns\,\beta_{x-1,s-1} - [1 - (-1)^s](2x - 1)\,\beta_{x-1,s}.$$

These coefficients may be exhibited by a rearrangement of the table given above as may be seen by comparing equations (5.1.1) and (5.1.3). The first few coefficients are as follows:

                      β_{x,s}
   s \ x      1          2                3
     2        1
     3        1
     4        1       -6 + 3
     5        1      -12 + 10
     6        1      -30 + 25      120 - 130 + 15

5.2 Poisson. The Poisson moments about the mean may be expressed as follows:

(5.2.1) $$\mu_s = \sum_{x=0}^{[s/2]} \alpha_{x,s}\, a^x$$

where $[\ ]$ represents “integral part of” and

(5.2.2) $$\alpha_{x,s+1} = x\,\alpha_{x,s} + s\,\alpha_{x-1,s-1}.$$

The coefficients $\alpha_{x,s}$ are the constant terms in the expressions for the corresponding binomial distribution coefficients in powers of $pq$.


Bell Telephone Laboratories. 



NOTE ON ZOCH’S PAPER ON THE POSTULATE OF THE
ARITHMETIC MEAN

By Albert Wertheimer 

1. Introduction. There appeared recently a paper by Richmond T. Zoch¹ entitled “On The Postulate of the Arithmetic Mean.” The stated purpose of his paper was to show that the derivation of the Postulate as given by Whittaker & Robinson is not correct. It is the purpose of this paper to show that Zoch has not proven any error to exist in the Whittaker & Robinson derivation, but that there are a few errors in his paper. As this paper is intended to be read with Zoch’s paper as a reference, the terms used there will not be redefined here, and except where otherwise stated, the symbols used will have the same meaning.

2. Zoch introduces the function

$$f = \bar{x} + a\mu_3/\mu_2$$

and claims that it satisfies all the four axioms of Whittaker & Robinson, and obviously it is not the arithmetic mean. He therefore concludes that their derivation must have errors somewhere, and proceeds to find them. Let us first examine the $f$ function. Considering only the part $\mu_3/\mu_2$, the partial derivatives with respect to $x_i$ are given by

$$\frac{3\mu_2\left[(x_i - \bar{x})^2 - \mu_2\right] - 2\mu_3 (x_i - \bar{x})}{n\mu_2^2}.$$

It is then stated (p. 172) “. . . clearly these partial derivatives are single valued and continuous. Therefore the function $\mu_3/\mu_2$ satisfies axiom IV.” Now, the condition that a function be continuous and single valued means of course that this be true throughout the region of definition of the function. It is not shown how these derivatives are clearly continuous and single valued for the very important case where all the $x$’s are equal and the derivatives become indeterminate. As a matter of fact they are not continuous in this case, and therefore the $f$ function does not satisfy axiom IV. To prove this, we only have to consider the very simple case where we let

$$x_i = k + c_i z$$


¹ This Journal, Vol. VI, no. 4, Dec. 1935, pp. 171-182.
where $k$ is a fixed constant, $c_i$ is a set of arbitrary constants not all equal, and $z$ is a parameter. We then have

$$\bar{x} = k + \bar{c}z, \qquad \mu_2 = \mu_2' z^2, \qquad \mu_3 = \mu_3' z^3,$$

where

$$\bar{c} = \frac{1}{n}\sum c_i, \qquad \mu_2' = \frac{1}{n}\sum (c_i - \bar{c})^2, \qquad \mu_3' = \frac{1}{n}\sum (c_i - \bar{c})^3.$$

Substituting these values in $f$ and the derivatives, we get, taking $a = 1$,

$$f = k + z\bar{c} + z^3\mu_3'/z^2\mu_2',$$

$$\frac{\partial f}{\partial x_i} = \frac{1}{n} + \frac{3z^2\mu_2'\left[z^2(c_i - \bar{c})^2 - z^2\mu_2'\right] - 2z^3\mu_3'\, z(c_i - \bar{c})}{n z^4 \mu_2'^2}.$$

Now going to the limit when $z$ approaches zero, and all the $x$’s approach $k$, we get

$$\lim_{z \to 0} f = k,$$

$$\lim_{z \to 0} \frac{\partial f}{\partial x_i} = \frac{1}{n}\left[-2 + 3(c_i - \bar{c})^2/\mu_2' - 2\mu_3'(c_i - \bar{c})/\mu_2'^2\right].$$

Thus, when all the $x$’s approach the same value, the function $f$ also approaches the same value independent of the $c$’s, that is, regardless of the mode of approach, while the derivatives can take on any value depending on the $c$’s, that is, on how the limiting value of $f$ is approached. The $f$ function then does not have continuous single valued partial derivatives, and therefore does not satisfy axiom IV.
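
The dependence of the limiting derivatives on the mode of approach can be seen numerically. The sketch below is an illustration only, with hypothetical function names: it evaluates the partial derivatives of f = mean + a mu_3 / mu_2 at points x_i = k + c_i z for a small z, using the expression for the derivative given in the text, and compares them with the limiting values for two different sets of constants c_i.

```python
def partial_derivatives(xs, a=1.0):
    """Partial derivatives of f = mean + a*mu3/mu2 with respect to each x_i."""
    n = len(xs)
    mean = sum(xs) / n
    mu2 = sum((x - mean) ** 2 for x in xs) / n
    mu3 = sum((x - mean) ** 3 for x in xs) / n
    return [
        1.0 / n
        + a * (3 * mu2 * ((x - mean) ** 2 - mu2) - 2 * mu3 * (x - mean)) / (n * mu2 ** 2)
        for x in xs
    ]


def limiting_derivatives(cs):
    """Limit of the derivatives (a = 1) as z -> 0 along x_i = k + c_i * z."""
    n = len(cs)
    cbar = sum(cs) / n
    m2 = sum((c - cbar) ** 2 for c in cs) / n
    m3 = sum((c - cbar) ** 3 for c in cs) / n
    return [(-2 + 3 * (c - cbar) ** 2 / m2 - 2 * m3 * (c - cbar) / m2 ** 2) / n for c in cs]


if __name__ == "__main__":
    k, z = 5.0, 1e-3
    for cs in ((0.0, 1.0, 2.0), (0.0, 1.0, 5.0)):
        near = partial_derivatives([k + c * z for c in cs])
        print(cs, near, limiting_derivatives(cs))
```

Both paths lead to the same point (all x equal to k), yet the derivative with respect to the first variable tends to about 0.83 along one path and about 0.56 along the other, which is the discontinuity asserted above.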

In part 2 of the paper it is stated “Now when the $x_i$ all approach $a$ then both $f$ and $\partial f/\partial x_i$ become indeterminate forms. However, in this case $f$ takes an indeterminate form which can be evaluated and it can be shown that $\mu_3/\mu_2$ will always have the value zero, i.e., $f$ will have the value $a$ when all the $x_i = a$; while the $\partial f/\partial x_i$ can take any value whatever and in general the $\partial f/\partial x_i$ will not be equal when the $x_i = a$.” This statement really amounts to saying that the $f$ function does not satisfy axiom IV, but it is there used to demonstrate that one of Schiaparelli’s propositions is false.

3. Having exhibited a function different from the arithmetic mean, and supposedly satisfying all the four axioms, the question is asked “Where is the proof given by Whittaker & Robinson lacking in rigor?” After numbering the various steps in the derivation “. . . for the sake of rigor and careful reasoning . . .” it is stated (p. 174), “The sixth step involves the tacit assumption that the partial derivatives are functions of $k$. These partial derivatives are not necessarily functions of $k$ . . .” and it is therefore concluded that the sixth step is not valid. Now, how can any function that by definition is to be evaluated at $kx_i$ not be a function of $k$? What is shown (pp. 174-5) is that these derivatives do not necessarily involve $k$ explicitly, but this is neither implied nor necessary for the sixth step, and there is no ground for doubting its validity.

4. In order to overcome the supposed defect in the sixth step, it is proposed
to change axiom IV so as to require the partial derivatives to be constants.
But even then (p. 175) “... there remains an objection in the seventh step.”
Now, the seventh step consists of the statement that if

$$\phi(x_i) = \sum c_i x_i$$

where the $c$'s are independent of the $x$'s, then due to the condition that $\phi$ be a
symmetric function, all the $c$'s must be equal. To show the defect in this
step it is stated that under certain conditions “... the function $f = \bar{x} + \mu_3/\mu_2$
will have partial derivatives with respect to $x_i$ which are unequal and constant;
yet at the same time the function $f$ is a symmetrical expression of the $n$
variables.” Granting that all that is correct, what has this got to do with the
seventh step? The $f$ function certainly is not of the type $\sum c_i x_i$ to which
the seventh step is applied.

5. One more point should be mentioned. On p. 181 it is supposedly proven
that any function satisfying the first three axioms must have continuous first
partial derivatives. The proof is essentially as follows: Assuming all the $x$'s
are given the same increment $\Delta x$, the increment of the function then is $\Delta\phi$.
It is then stated “... but by axiom I, $\Delta\phi = \Delta x$. Therefore $\Delta\phi/\Delta x = 1 = d\phi/dx$.
In other words, the total derivative of $\phi$ exists and is constant. Therefore the
total derivative of $\phi$ is continuous.” From this, the continuity of the first
partial derivatives is proven by means of Euler’s Theorem for homogeneous
functions. Now, just what does the symbol $d\phi/dx$ (which is called the total
derivative) mean for a function of many independent variables? Besides,
(whatever this symbol means) is it considered rigorous to deduce a general
Theorem from the very special case where all the differentials are made equal?
This is one place where the $f$ function could be used effectively as an exhibit
of a function satisfying the first three axioms, and not having continuous partial
derivatives.

It is also stated (p. 181) that “... it would seem more satisfactory to
postulate that the function $\phi$ is single valued, for the single-valuedness of a derivative
does not insure the single-valuedness of the integral while the single-valuedness
of a function does insure the single-valuedness of the derivative where the
derivative exists.” This statement is certainly not self evident and requires
proof. For a single variable at least, it is easy to imagine a function
represented by a curve with corners defined in a certain interval. The function then
could be single valued everywhere in the interval, while the derivatives at the
corners may exist and have two distinct values, depending on whether the
corner is approached from the right or the left. On the other hand it is hard
to imagine a curve representing a single valued function such that the integral,
i.e. the function represented by the area under the curve, should not be single
valued.

6. In Conclusion: It is stated in the Introduction that “Since this book has 
had wide circulation, it is believed that the errors in this proof should be called 
to the attention of the users of the book. The present paper has been prepared 
for this purpose." It is for the same reason, that this paper was prepared to 
show that no error has been proven to exist. 

Bureau of Ordnance, U. S. Navy Department 



NOTE ON THE BINOMIAL DISTRIBUTION 

By C. E. Clark 


The purpose of this note is to show that

(1) $$f(x) = (-1)^n \frac{n!\, p^x q^{n-x} \sin \pi x}{\pi\, x^{(n+1)}},$$

where $n$ is an integer $\ge 0$, $0 < p < 1$, $p + q = 1$, and $x^{(n+1)} = x(x - 1)(x - 2) \cdots (x - n)$,
is a function whose values at $x = 0, 1, 2, \cdots, n$ are the successive
terms of the expansion of $(q + p)^n$, and also to consider the problem of fitting
$f(x)$ to an observed frequency distribution.

The statement made about (1) can be verified by evaluating (1) as an
indeterminate form. On the other hand, (1) can be derived by observing that the
$x$-th term ($x$ an integer) of the expansion of $(q + p)^n$ is

(2) $$\frac{n!}{x!\,(n - x)!}\, p^x q^{n-x} = \frac{\Gamma(n + 1)\, p^x q^{n-x}}{\Gamma(x + 1)\, \Gamma(n - x + 1)};$$

then (1) can be derived from (2) by means of the product expansions for $\Gamma(x)$
and $\sin x$. This derivation of (1) from (2) can also be carried out by expressing
(2) as a Beta function and then using

$$B(x + 1, n - x + 1) = \int_0^\infty \frac{t^x}{(1 + t)^{n+2}}\, dt = (-1)^n \frac{\pi\, x^{(n+1)}}{(n + 1)!\, \sin \pi x}.$$


This integration can be performed by means of the theory of residues. 
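The agreement between the sine form and the binomial terms can be checked numerically. The sketch below assumes the closed form $(-1)^n\, n!\, p^x q^{n-x} \sin \pi x / (\pi\, x^{(n+1)})$, which is what the reflection formula for the Gamma function yields from (2); the function names are illustrative.

```python
import math


def falling_product(x, n):
    """x^{(n+1)} = x (x - 1) ... (x - n)."""
    prod = 1.0
    for j in range(n + 1):
        prod *= x - j
    return prod


def f_sine_form(x, n, p):
    """Sine form of the interpolated binomial term (indeterminate at x = 0, 1, ..., n)."""
    q = 1.0 - p
    return ((-1) ** n * math.factorial(n) * p ** x * q ** (n - x)
            * math.sin(math.pi * x) / (math.pi * falling_product(x, n)))


def f_gamma_form(x, n, p):
    """Gamma-function form (2), valid for -1 < x < n + 1."""
    q = 1.0 - p
    return (math.gamma(n + 1) * p ** x * q ** (n - x)
            / (math.gamma(x + 1) * math.gamma(n - x + 1)))


if __name__ == "__main__":
    n, p = 4, 0.3
    for k in range(n + 1):
        exact = math.comb(n, k) * p ** k * (1 - p) ** (n - k)
        # Evaluate slightly off the integer, since the sine form is 0/0 there.
        print(k, exact, f_sine_form(k + 1e-6, n, p))
```

Evaluating just beside each integer stands in for taking the limit of the indeterminate form; between integers the two forms agree to rounding error.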

Consider the problem of fitting (1) to an observed frequency distribution.
We shall write (1) in the form

(3) $$F(z) = a\, b^x\, \frac{\sin \pi x}{x^{(n+1)}}, \qquad x = \frac{nb}{1 + b} + h(z - \bar{z}),$$

and determine the constants $a$, $b$, $n$, and $h$ so that, when $\bar{z}$ is the mean of the
observed distribution, $F(z)$ will fit the distribution.

The values of $a$, $b$, $n$, and $h$ can be determined by the method of moments.
Let $\nu_2$, $\nu_3$, and $\nu_4$ denote the usual second, third, and fourth moments of the
distribution, which are calculated in the usual way (as in W. P. Elderton,
Frequency-Curves and Correlation) and not adjusted by any procedure such as
Sheppard's adjustments. Also, use the usual notation $\beta_1 = \nu_3^2/\nu_2^3$ and $\beta_2 = \nu_4/\nu_2^2$.
Then, the method of moments gives


(4) 

(5) 


2 



$$a = (-1)^n\, \frac{n!\, h}{\pi (1 + b)^n}\, N,$$ where $N$ is the sum of the frequencies of the distribution.

An integer $n$ is chosen nearest the value assigned by (4). The two values of
$b$ from (5) determine two curves that are congruent but whose skewnesses are
of opposite sign. Hence, $b$ is uniquely determined by (5) and the sign of the
skewness of the data.
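A moment fit of this kind can be sketched in a few lines. Equations (4) and (5) are not legible in this copy, so the relations used below are an assumption: they are the standard binomial moment identities $\beta_2 - 3 - \beta_1 = -2/n$, $\beta_1 = (1 - b)^2/(nb)$ and $\nu_2 = nb/[(1 + b)^2 h^2]$ with $b = p/q$, which reduce to $n = 2/(3 - \beta_2)$ in the symmetrical case. The function name and the rule tying the choice of $b$ to the sign of $\nu_3$ (taking $h > 0$) are likewise illustrative.

```python
import math


def fit_binomial_by_moments(nu2, nu3, nu4):
    """Return (n, b, h) from the second, third and fourth central moments.

    Assumes the standard binomial identities (not copied from the note):
        3 + beta1 - beta2 = 2 / n
        b + 1/b = 2 + n * beta1,        b = p / q
        nu2 = n * b / ((1 + b)**2 * h**2)
    """
    beta1 = nu3 ** 2 / nu2 ** 3
    beta2 = nu4 / nu2 ** 2
    n = round(2.0 / (3.0 + beta1 - beta2))   # nearest integer
    s = 2.0 + n * beta1                      # equals b + 1/b
    root = math.sqrt(max(s * s - 4.0, 0.0))
    b_small, b_large = (s - root) / 2.0, (s + root) / 2.0
    # With h > 0, positive skewness corresponds to p < q, i.e. b < 1.
    b = b_small if nu3 >= 0 else b_large
    h = math.sqrt(n * b / nu2) / (1.0 + b)
    return n, b, h


if __name__ == "__main__":
    # Exact moments of a binomial with n = 6, p = 0.3 in unit classes.
    n0, p0 = 6, 0.3
    q0 = 1 - p0
    v = n0 * p0 * q0
    print(fit_binomial_by_moments(v, v * (q0 - p0), 3 * v * v + v * (1 - 6 * p0 * q0)))
```

Feeding in the exact moments of a known binomial returns its $n$, the ratio $b = p/q$, and $h = 1$, which is a quick consistency check on the assumed relations.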

For a symmetrical distribution, $b = 1$, $\beta_1 = 0$, and

$$n = \frac{2}{3 - \beta_2}.$$



We shall consider an illustrative example. In the following table the columns
$f(z)$ and $f_2(z)$ are taken from W. P. Elderton, Frequency-Curves and Correlation
(1906), page 62. $f(z)$ is an empirical frequency distribution, while $f_2(z)$ is
obtained by fitting a Pearson Type II curve to the distribution $f(z)$. $f_1(z)$ is
computed from


U (z) = 1624 S1 ^ ) x = 2.0973 + .80& 


which is determined by the method of this note. $f_3(z)$ is obtained by fitting
the normal curve

$$f_3(z) = 485.1\, e^{-\frac{(z - .4986)^2}{2(1.829)^2}}.$$


  z     f(z)    f_1(z)   f_2(z)   f_3(z)

 -3      11       18       14       19
 -2     116      107      109       92
 -1     274      281      286      263
  0     451      438      433      444
  1     432      437      433      444
  2     267      267      285      263
  3     116      106      109       92
  4      16       18       14       19


The coefficients of goodness of fit for $f_1(z)$, $f_2(z)$, and $f_3(z)$ are respectively
.35, .58, and .02.



CONVEXITY PROPERTIES OF GENERALIZED MEAN VALUE 

FUNCTIONS 1 

By Nilan Norris 


Consider the following generalized mean value functions: (1) the unit weight
or simple sample form, $\phi(t) = \left[\dfrac{x_1^t + x_2^t + \cdots + x_n^t}{n}\right]^{1/t}$, in which the $x_i$ are
positive real numbers not all equal each to each, and in which $t$ may take any real
value; (2) the weighted sample form, $\omega(t) = \left[\dfrac{c_1 x_1^t + c_2 x_2^t + \cdots + c_n x_n^t}{c_1 + c_2 + \cdots + c_n}\right]^{1/t}$,
in which the $c_i$ are positive numbers not all equal each to each, and in which the

$x_i$ and $t$ are restricted as in $\phi(t)$; (3) the integral form, $\theta(t) = \left[\int_{x=0}^{1} x^t\, dx\right]^{1/t}$,
where $\int_0^1 x^t\, dx$ exists for every real value of $t$; and (4) the generalized integral
form $\Phi(t) = \left[\int_{x=0}^{\infty} x^t\, d\psi(x)\right]^{1/t}$, where $\psi(x)$ is a non-decreasing function integrable
in the Riemann-Stieltjes sense such that $\psi(\infty) - \psi(0) = 1$, and such that
$\int_{x=0}^{\infty} x^t\, d\psi(x)$ exists for every real value of $t$. The facts that all of these
functions are monotonic increasing and that both $\phi(t)$ and $\omega(t)$ have two horizontal
asymptotes have been previously demonstrated. 2 Although the existence of
$\phi(t)$ and $\omega(t)$ has been known since 1840, there appears to have been no attempt
made to investigate the behavior of the second derivatives of them. 3

When the $x_i$ are price relatives, production relatives, or similar data, $\phi(t)$
and $\omega(t)$ yield common types of index numbers by direct substitution of integral
values of $t$. For any values of $t$ such that $0 < t_1 < t_2 < \infty$, the type bias of
$\phi(t_2)$ will be greater than the type bias of $\phi(t_1)$. Similarly, for any values of $t$
such that $-\infty < t_1 < t_2 < 0$, the type bias of $\phi(t_1)$ will be greater than the
type bias of $\phi(t_2)$. The second derivatives of $\phi(t)$ and $\omega(t)$ indicate whether


1 Presented at a joint meeting of the American Mathematical Society, the Econometric 
Society, and the Institute of Mathematical Statistics at St. Louis on January 2, 1936. 
The writer is indebted to C. C. Craig, Einar Hille, Dunham Jackson, and J. Shohat for 
helpful critical reviews of the preliminary draft of this paper. 

2 G. H. Hardy, J. E. Littlewood, and G. Pólya, Inequalities (Cambridge University
Press, London, 1934), pp. 12-16; and Nilan Norris, “Inequalities among Averages,” Annals
of Mathematical Statistics, Vol. VI, No. 1, March, 1935, pp. 27-29.

3 Jules Bienaymé, Société Philomatique de Paris, Extraits des procès-verbaux des séances
pendant l’année 1840 (Imprimerie D’A. René et Cie., Paris, 1841), Séance du 13 juin 1840,
p. 68.
type bias is changing at an increasing or a decreasing rate as between the
unlimited number of averages available for use. Considerable interest attaches
to $\omega(t)$, the weighted sample form of function.

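Let $\omega(t)$ be made arbitrary for the case of $n = 2$, with $x_1 = 1$, and $x_2 = e^{-\lambda}$,
where $\lambda$ is any real number. Also let $c_1 = \alpha$, and $c_2 = \beta$, where $\alpha + \beta = 1$.

Then $\omega(t) = [\alpha + \beta e^{-\lambda t}]^{1/t}$. Now for all values of $t$,

$$\alpha + \beta e^{-\lambda t} = 1 - \frac{\beta\lambda}{1}\, t + \frac{\beta\lambda^2}{2}\, t^2 - \frac{\beta\lambda^3}{6}\, t^3 + \cdots$$

For $|t|$ sufficiently small, it follows that

$$\log(\alpha + \beta e^{-\lambda t}) = -\beta\lambda t + \tfrac{1}{2}\beta\lambda^2(1 - \beta)t^2 + \beta\lambda^3\left(-\tfrac{1}{6} + \tfrac{\beta}{2} - \tfrac{\beta^2}{3}\right)t^3 + \cdots$$

so that for $t \ne 0$

$$\frac{1}{t}\log(\alpha + \beta e^{-\lambda t}) = -\beta\lambda + \tfrac{1}{2}\beta\lambda^2(1 - \beta)t + \beta\lambda^3\left(-\tfrac{1}{6} + \tfrac{\beta}{2} - \tfrac{\beta^2}{3}\right)t^2 + \cdots$$

Therefore

$$\omega(t) = \exp\left[\frac{1}{t}\log(\alpha + \beta e^{-\lambda t})\right] = e^{-\beta\lambda}\left[1 + \tfrac{1}{2}\beta\lambda^2(1 - \beta)t + \beta\lambda^3\left\{-\tfrac{1}{6} + \tfrac{\beta}{2} - \tfrac{\beta^2}{3} + \tfrac{1}{8}\beta(1 - \beta)^2\lambda\right\}t^2 + \cdots\right].$$

It follows that $\omega''(0) = 2\beta\lambda^3 e^{-\beta\lambda}\left[-\tfrac{1}{6} + \tfrac{\beta}{2} - \tfrac{\beta^2}{3} + \tfrac{1}{8}\beta(1 - \beta)^2\lambda\right]$. It is clear
that $\omega(0)$ is the weighted geometric mean, and that $\phi(0)$ is the unit weight or
simple sample form of geometric mean. As a means of demonstrating the range
of values which $\omega''(0)$ may take it is helpful to rewrite the expression for $\omega''(0)$
as follows: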


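$$\omega''(0) = \tfrac{1}{4}\beta^2(1 - \beta)^2\lambda^3\left[\lambda - \frac{4(1 - 2\beta)}{3\beta(1 - \beta)}\right]e^{-\beta\lambda} \equiv f(\lambda, \beta).$$

This consideration makes it possible to distinguish three cases of $y = f(\lambda, \beta)$
for fixed $\beta$, namely, $0 < \beta < \tfrac{1}{2}$; $\beta = \tfrac{1}{2}$; and $\tfrac{1}{2} < \beta < 1$. In all three cases
$f(\lambda, \beta)$ has an absolute minimum $\mu(\beta) \le 0$, and $\mu(\tfrac{1}{2}) = 0$. The corresponding
values of $\lambda$ satisfy the quadratic equation

$$\lambda^2 - \frac{4}{3}\,\frac{4 - 5\beta}{\beta(1 - \beta)}\,\lambda + \frac{4 - 8\beta}{\beta^2(1 - \beta)} = 0.$$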

It is clear that by taking $\beta$ near enough to 0, one can make $\mu(\beta)$ as large negative
as is desired. Also, by choosing $\lambda$ properly, one can make $\omega''(0)$ take any
value between $\mu(\beta)$ and $\infty$. For example, when $\alpha = \beta = \tfrac{1}{2}$, $\lambda$ may be selected
so as to make $\omega''(0)$ any arbitrarily chosen non-negative number. For then
$\omega''(0) = \dfrac{\lambda^4}{64}\, e^{-\lambda/2}$, and as $\lambda$ increases from $-\infty$ to 0, $\omega''(0)$ decreases from $\infty$ to
0. If $\lambda = 0$, $\omega''(0) = 0$. If $\lambda > 0$, as $\lambda$ increases from 0 to 8, $\omega''(0)$ increases to
$64e^{-4}$, and as $\lambda$ increases beyond 8, $\omega''(0)$ decreases, approaching 0 as $\lambda$ increases
indefinitely. It is evident that the case of $\alpha = \beta = \tfrac{1}{2}$, with $\lambda = -\log 2$, $x_1 = 1$,
and $x_2 = e^{-\lambda}$, is one in which $\omega(t)$ becomes the unit weight or simple sample
type of generalized mean value function, namely, $\phi(t) = \left[\dfrac{1 + 2^t}{2}\right]^{1/t}$. Reference
to the first expression above noted for $\omega''(0)$ will make clear that $\phi''(0) = \dfrac{(\log 2)^4}{64}\sqrt{2}$
in this special case.
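The value of $\omega''(0)$ can be checked by a finite difference. The following sketch assumes $\omega(t) = [\alpha + \beta e^{-\lambda t}]^{1/t}$ with $\alpha = 1 - \beta$, and compares a central difference at $t = 0$ with the series result $2\beta\lambda^3 e^{-\beta\lambda}[-\tfrac16 + \tfrac{\beta}{2} - \tfrac{\beta^2}{3} + \tfrac18\beta(1-\beta)^2\lambda]$; the function names and step size are illustrative choices.

```python
import math


def omega(t, beta, lam):
    """Weighted mean of x1 = 1 and x2 = exp(-lam) with weights 1 - beta and beta."""
    if t == 0.0:
        return math.exp(-beta * lam)          # weighted geometric mean
    return ((1.0 - beta) + beta * math.exp(-lam * t)) ** (1.0 / t)


def second_derivative_at_zero(beta, lam, step=1e-3):
    """Central-difference estimate of omega''(0)."""
    return (omega(step, beta, lam) - 2.0 * omega(0.0, beta, lam)
            + omega(-step, beta, lam)) / step ** 2


def closed_form(beta, lam):
    """Series value of omega''(0)."""
    bracket = (-1.0 / 6.0 + beta / 2.0 - beta ** 2 / 3.0
               + beta * (1.0 - beta) ** 2 * lam / 8.0)
    return 2.0 * beta * lam ** 3 * math.exp(-beta * lam) * bracket


if __name__ == "__main__":
    for beta, lam in ((0.5, 8.0), (0.5, -math.log(2.0)), (0.3, 2.0)):
        print(beta, lam, second_derivative_at_zero(beta, lam), closed_form(beta, lam))
```

For $\beta = \tfrac12$ the closed form reduces to $\lambda^4 e^{-\lambda/2}/64$, giving $64e^{-4}$ at $\lambda = 8$ and $(\log 2)^4\sqrt{2}/64$ at $\lambda = -\log 2$.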


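Analysis of $\Phi(t)$, the generalized integral form of generalized mean value
function, makes it possible to characterize populations of a very general
character, as well as samples. But in the case of $\Phi(t)$ it is even more difficult to
generalize as to convexity properties. For example, let

$$\Phi(t) = \left[\int_{x=0}^{\infty} x^t\, d\psi(x)\right]^{1/t},$$

where

$$\psi(x) = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\log x} e^{-v^2}\, dv.$$

This expression is obviously of the required generalized integral type. Now

$$[\Phi(t)]^t = \frac{1}{\sqrt{\pi}} \int_{u=-\infty}^{\infty} e^{tu}\, e^{-u^2}\, du = \frac{e^{t^2/4}}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-(u - t/2)^2}\, du = e^{t^2/4}.$$

Therefore $\Phi(t) = e^{t/4}$, and $\Phi''(t) = \dfrac{e^{t/4}}{16} > 0$ for all $t$. That is, in this particular
case, $\Phi(t)$ has only one horizontal asymptote.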
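The closed form for this example can be confirmed by direct quadrature. The sketch below assumes the weight $d\psi = \pi^{-1/2} e^{-u^2}\, du$ with $x = e^u$, so that the $t$-th power of the mean is $\pi^{-1/2}\int e^{tu - u^2}\, du$; the integration range and step count are arbitrary illustrative choices.

```python
import math


def generalized_integral_mean(t, half_width=20.0, steps=40000):
    """[ (1/sqrt(pi)) * integral of exp(t*u - u*u) du ]^(1/t), t != 0, by the trapezoidal rule."""
    h = 2.0 * half_width / steps
    total = 0.0
    for j in range(steps + 1):
        u = -half_width + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * math.exp(t * u - u * u)
    integral = total * h / math.sqrt(math.pi)
    return integral ** (1.0 / t)


if __name__ == "__main__":
    for t in (-3.0, 0.5, 2.0):
        print(t, generalized_integral_mean(t), math.exp(t / 4.0))
```

The numerical mean tracks $e^{t/4}$, an exponential in $t$, which is convex everywhere and has a horizontal asymptote only as $t \to -\infty$.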

The foregoing examples indicate that the following conclusions may be drawn
as to the diverse convexity attributes of the various means as functions of $t$:
(1) The unit weight form, $\phi(t)$, and the weighted sample form, $\omega(t)$, must always
have a point of inflection, since both of them not only increase with $t$, but are
doubly asymptotic (have two horizontal asymptotes). (2) Points of inflection
for $\phi(t)$ and $\omega(t)$ do not necessarily occur at $t = 0$. (3) The generalized integral
form, $\Phi(t)$, need not always have a point of inflection. That is, the second
derivatives of certain forms of $\Phi(t)$ do not change their sign, since such forms
are concave upward.


University of Michigan.



A SIMPLE FORM OF PERIODOGRAM 


By Dinsmore Alter 

Schuster’s introduction of a method of systematic search for hidden periodici¬ 
ties and cycles opened a new field for the investigator of statistical data. The 
beauty of his method in its analogy to analysis of light, and the great reputa¬ 
tion of its author, combined to give it universal acceptance and to blind statis¬ 
ticians to its faults. 

In more recent years at least three new mathematical and two mechanical 
forms of periodogram analysis have been proposed, each of which exhibits 
certain advantages over the original one. The use of the term periodogram 
for these forms is an extension of Schuster’s original definition which used as 
abscissae quantities proportional to the squares of the amplitudes of the sine 
terms found in the data for the various trial periods. He wrote: “It is
convenient to have a word for some representation of a variable quantity which
shall correspond to the spectrum of a luminous radiation. I propose the word
periodogram and define it more particularly in the following way:

Let $\tfrac{1}{2}Ta = \int_{t_1}^{t_1+T} f(t)\cos kt\, dt$ and $\tfrac{1}{2}Tb = \int_{t_1}^{t_1+T} f(t)\sin kt\, dt$

where $T$ may for convenience be chosen equal to some integer multiple of $\dfrac{2\pi}{k}$,
and plot a curve with $\dfrac{2\pi}{k}$ as abscissae and $r = \sqrt{a^2 + b^2}$ as ordinates; this curve,
or better, the space between this curve and the axis of abscissae, represents the
periodogram of $f(t)$.”

The following appear to be the essential criteria for a satisfactory form of 
periodogram: 

1. It must exhibit plainly any repetition of form in the data regardless of 
how irregular the shape of the repeated interval may be. In doing this it 
must exaggerate the amplitude of the main terms at the expense of the 
lesser ones. 

2. The calculation of the indices must be short. In a periodogram from 
many data the indices sometimes are computed for several hundred trial 
periods. 

3. There should be a geometrical interpretation of the index used. 

4. The frequency distribution of the index must be known. 

5. Combining or smoothing the data should modify the index in a manner 
which leaves an obvious interpretation.
The Schuster periodogram has the following disadvantages:

1. Only sine terms of large amplitude are exhibited. A perfect repetition
of an extremely irregular form of data would not be indicated in any way.

2. The calculations are long. 

3. There is a considerable uncertainty in the length of the period found. 
Those methods of analysis which use harmonics as well as the fundamental 
have much less of this uncertainty. 

The correlation periodogram has advantages in each of these points over the 
Schuster. However, even with it the calculations are fairly long. Further¬ 
more, the modification of the coefficient introduced by grouping or smoothing 
is not a linear one. 

The periodogram described here is a slight modification of one for which a 
preliminary note was published in 1933. Additional features have been studied 
and its applications to many data have shown its ease of calculation. This 
calculation has been reduced still more by a mechanical method which renders 
it practicable to contemplate the possibility of studying many data hitherto 
prohibited by excessive cost. 

Consider data $x_0, x_1, x_2, \cdots, x_i, \cdots, x_{n-1}$. Let $l$ be any integer less than $n$.
Form the sum of the absolute values of $x_i - x_{i-l}$, designated by $\sum |x_i - x_{i-l}|$.

Define $$A_l = \sum_{i=l}^{n-1} \frac{|x_i - x_{i-l}|}{n - l}.$$ $l$ takes the values of the various trial periods and
is called the lag. $A_l$, therefore, is the mean error between prediction that data
will be repeated after a lag of $l$ and the fulfillment of the prediction. Such
an index has a meaning that is immediately of use to a meteorologist or other
investigator. Coefficients such as the Schuster and the correlation coefficient,
although valuable statistically, are of less immediate interest.
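The index is simple enough to state in a few lines of code. The sketch below computes the mean absolute difference between data separated by a lag $l$, for zero-indexed data of length $n$; the function names `lag_index` and `periodogram` are illustrative.

```python
def lag_index(data, lag):
    """Mean absolute difference between data[i] and data[i - lag], i = lag, ..., n - 1."""
    n = len(data)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(data)")
    total = sum(abs(data[i] - data[i - lag]) for i in range(lag, n))
    return total / (n - lag)


def periodogram(data, max_lag):
    """Index for every trial lag 1, ..., max_lag."""
    return {lag: lag_index(data, lag) for lag in range(1, max_lag + 1)}


if __name__ == "__main__":
    # An irregular pattern of length 4, repeated exactly.
    series = [1, 5, 2, 7] * 6
    for lag, value in periodogram(series, 9).items():
        print(lag, value)
```

For an exactly repeated pattern the index falls to zero at the repeat length and its multiples, however irregular the repeated shape is.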

The standard deviation of these errors of prediction follows at once from
standard formulae under assumption of normal distribution.

$$\sigma = 1.25\, A_l$$

The distribution of $\sigma$, as computed from the absolute values of data, has
been studied by Helmert and by Fisher. Davies and E. S. Pearson have
compared the various methods of estimating $\sigma$. For the large number, $(n - l)$,
pairs of data used for a periodogram point, this method becomes almost as
precise as the usual one which would square the values of $(x_i - x_{i-l})$. For
$(n - l)$ as small as 50, the standard deviation of the standard deviation by this
method is only seven percent larger than by the other one. Fisher has shown
that

$$\sigma_\sigma \to \sqrt{\frac{\pi - 2}{2}}\; \frac{\sigma}{\sqrt{n - l}} \quad \text{as } (n - l) \to \infty.$$

This may be written as

$$\sigma_\sigma = \frac{1.068\, \sigma}{\sqrt{2(n - l)}}.$$







The distribution approaches normal rapidly and for all values of (n — l) that 
would be used in periodogram calculation certainly may be considered as normal. 
It will be very seldom that a value of (n — l) much smaller than 200 will be 
used. 

The data may be printed on two strips of adding machine tape held together
by clips so as to match data separated by a lag $l$. In arranging them for
investigation, it usually is most convenient to make all numbers positive. The
computer subtracts mentally and puts the difference into an adding machine,
which gives him $A_l$ almost immediately.

For some computers, and especially where the numbers are large, another
method of obtaining $A_l$ may save time or lead to fewer numerical mistakes. The
computer will form the sum of all his data. He will, as for the other form of
computation, put these on two pieces of adding machine tape that he lays side
by side. However, instead of putting the difference of the pairs into the
machine, he will, in each case, put in the smaller datum of the pair. Then,

$$(n - l)A_l = \left[\sum \text{1st } (n - l) + \sum \text{last } (n - l) \text{ data}\right] - 2\sum \text{smaller}$$
$$\phantom{(n - l)A_l} = 2\sum \text{all data} - \left[\sum \text{1st } l + \sum \text{last } l \text{ data}\right] - 2\sum \text{smaller}$$

The derivation of this equation is obvious, since the absolute difference of a
pair is the sum of the pair less twice its smaller member. In computing by this method the
subtotaler on the machine can be used to make the strip of sums of the first
$(n - l)$ data and of the last $(n - l)$ for all values of $l$. The first term on the
right hand side of the second form is a constant, the last is twice the sum of the smaller numbers
chosen in the pairs. I have computed by both methods, and where the numbers
are small, I prefer the former. Where they are large, I prefer the latter.
However, when one must use comparatively untrained computers, he will find fewer
mistakes made if the computer does not make the subtractions.
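The second method of computation rests on the identity $|u - v| = u + v - 2\min(u, v)$ applied to each matched pair. The following sketch, with illustrative function names and made-up data, checks that summing the smaller datum of each pair reproduces the directly computed index.

```python
def lag_index_direct(data, lag):
    """Mean absolute difference of pairs separated by `lag`."""
    n = len(data)
    return sum(abs(data[i] - data[i - lag]) for i in range(lag, n)) / (n - lag)


def lag_index_by_smaller(data, lag):
    """Same index, entering only the smaller datum of each pair.

    Uses |u - v| = u + v - 2 * min(u, v): the pair sums add up to the sum of
    the first n - lag data plus the sum of the last n - lag data.
    """
    n = len(data)
    pairs = n - lag
    first = sum(data[:pairs])        # data matched as the earlier member
    last = sum(data[lag:])           # data matched as the later member
    smaller = sum(min(data[i], data[i - lag]) for i in range(lag, n))
    return (first + last - 2 * smaller) / pairs


if __name__ == "__main__":
    rainfall = [12, 7, 9, 15, 3, 8, 11, 6, 14, 10]   # made-up figures
    for lag in range(1, len(rainfall)):
        print(lag, lag_index_direct(rainfall, lag), lag_index_by_smaller(rainfall, lag))
```

The sum of the first and last $n - l$ data equals twice the grand total less the sums of the first $l$ and the last $l$ data, so only one grand total and a strip of running subtotals are needed for all lags.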

The calculation of $A_l$ is much shorter than that for the indices even of the
correlation and variance periodograms. It may, however, be shortened even
more by a mechanical arrangement. $(n - l)A_l$ is the area between two
histograms of the data matched after a lag $l$. These may be carefully graphed on a
large scale and two such graphs superposed over a table with a translucent
illuminated top. On the edge of this table is the track to guide a rolling
planimeter. $A_l$, as computed by this means, is accurate to approximately one-half
of one percent of its value, a much more exact value than is needed. The
details of such a device as constructed for the Griffith Observatory are shown by
the accompanying photograph and diagram. The dual saving of time by the
method and by its mechanical application have resulted in the adoption of a
much more ambitious program of meteorological research than previously was
contemplated.




Planimeter Device for Mechanical Calculation

The form taken by the periodogram is important. Consider the simplest
case, data which follow a sine curve.

$$y_i = a\cos\left(\frac{2\pi i - c}{p}\right)$$

$$y_i - y_{i-l} = 2a\sin\frac{\pi l}{p}\left[\sin\frac{\pi l + c - 2\pi i}{p}\right]$$

The term in brackets takes values distributed around the circle and the part
outside is a constant for any one lag. The bracket term sums approximately to
$\dfrac{2}{\pi}(n - l)$, since we consider all terms as of one sign only.

If the absolute values were not considered in the expression for $A_l$, the
periodogram would be a sine curve of period $2p$. The lack of sign gives a cusp curve
with the cusp at lags $p$, $2p$, etc. Such a form is advantageous in that the
periodogram gives sharp peaks at multiples of the periods which may exist.
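The cusp form can be seen numerically. The sketch below builds a pure cosine series and compares the index with $(4a/\pi)\,|\sin(\pi l/p)|$, the value obtained by replacing the mean of the absolute bracket term with $2/\pi$; the period 12.5, the series length, and the function names are illustrative assumptions.

```python
import math


def lag_index(data, lag):
    """Mean absolute difference of pairs separated by `lag`."""
    n = len(data)
    return sum(abs(data[i] - data[i - lag]) for i in range(lag, n)) / (n - lag)


def cosine_series(n, amplitude, period, phase=0.0):
    """y_i = a cos((2 pi i - c) / p) for i = 0, ..., n - 1."""
    return [amplitude * math.cos((2.0 * math.pi * i - phase) / period) for i in range(n)]


def predicted_index(lag, amplitude, period):
    """Approximate index (4a/pi) |sin(pi l / p)| for a pure cosine."""
    return 4.0 * amplitude / math.pi * abs(math.sin(math.pi * lag / period))


if __name__ == "__main__":
    y = cosine_series(2000, 1.0, 12.5)
    for lag in (5, 12, 13, 25, 30):
        print(lag, round(lag_index(y, lag), 4), round(predicted_index(lag, 1.0, 12.5), 4))
```

With a period of 12.5 the index drops sharply near lags 12 and 13 and vanishes at lag 25, the first integer lag that is an exact multiple of the period.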

The effect of the periodogram in exaggerating the principal terms at the
expense of the smaller ones may be obtained most easily by equating $\sigma$ as
obtained by the linear and the quadratic formulae.
The data may be written as the sum of cosine terms

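$$y_i = a\cos\left(\frac{2\pi i - \varphi_a}{p_a}\right) + b\cos\left(\frac{2\pi i - \varphi_b}{p_b}\right) + \cdots + c_i$$

$$y_i - y_{i-l} = 2a\sin\frac{\pi l}{p_a}\left[\sin\frac{\pi l + \varphi_a - 2\pi i}{p_a}\right] + \cdots + (c_i - c_{i-l})$$

$$\sum (y_i - y_{i-l})^2 = 2(n - l)a^2\sin^2\frac{\pi l}{p_a} + 2(n - l)b^2\sin^2\frac{\pi l}{p_b} + \cdots + 2(n - l)\sigma_c^2$$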

The sine terms contribute to A) in proportion to the squares of their ampli- 

2 irl • 

tudes. On account of the sin — factor, they contribute very little to values 

Pi 

of A i for which — is not very closely an even multiple of ir. 


This method has been applied to rainfall data of the Pacific Coast and has
proved as satisfactory in practice as would be expected from the simplicity
of the theory. The periodogram of rainfall stations along the northern third
of the California coast is shown here, exhibiting perhaps the most definite
single piece of evidence ever found for rainfall cycles. Outstanding is a cycle
of about 45 years with its fourth harmonic as the secondary feature. The
writer expects to publish the results of that work in the Monthly Weather
Review.





ON CERTAIN DISTRIBUTIONS DERIVED FROM THE MULTINOMIAL 

DISTRIBUTION 1 

By Solomon Kullback 


1. Introduction. With the multinomial distribution as a background, there 
may be derived a number of distributions which are of interest in certain prac¬ 
tical applications. Several of these distributions are here presented and the 
theory is illustrated by specific examples. 


2. Preliminary data. In the discussion of the distributions to be considered 
there are needed certain factorial sums whose values are now to be derived. 
In the following discussion only positive integral values (including zero) are 
to be considered. 

There is desired the value, in terms of $N$, $n$, $r$, of

(2.1) $$f_r(n, N) = \sum \frac{N!}{x_1!\, x_2! \cdots x_n!}$$

where the summation is for all values of $x_1, x_2, \cdots, x_n$ such that
$x_1 + x_2 + \cdots + x_n = N$ and no $x$ is equal to $r$.

Let us first consider the case for $r = 0$; i.e., we desire a value for the sum in
(2.1) for all values of $x_1, x_2, \cdots, x_n$ such that $x_1 + x_2 + \cdots + x_n = N$ and
no $x$ is equal to zero. By the multinomial theorem, we have that 2

(2.2) $$(a_1 + a_2 + \cdots + a_n)^N = \sum \frac{N!}{x_1!\, x_2! \cdots x_n!}\, a_1^{x_1} a_2^{x_2} \cdots a_n^{x_n}$$

where the summation is for all values of $x_1, x_2, \cdots, x_n$ such that
$x_1 + x_2 + \cdots + x_n = N$. If $a_1 = a_2 = \cdots = a_n = 1$, then

(2.3) $$n^N = \sum \frac{N!}{x_1!\, x_2! \cdots x_n!}, \qquad x_1 + x_2 + \cdots + x_n = N.$$


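The sum in (2.3) may however be rearranged into the sum of a number of
terms as follows:

(2.4)
$$\begin{aligned}
&\sum \frac{N!}{x_1!\, x_2! \cdots x_n!}, && x_1 + x_2 + \cdots + x_n = N,\ \text{no } x = 0;\\
&n \sum \frac{N!}{x_1!\, x_2! \cdots x_{n-1}!}, && x_1 + x_2 + \cdots + x_{n-1} = N,\ \text{no } x = 0;\\
&\frac{n(n-1)}{2!} \sum \frac{N!}{x_1!\, x_2! \cdots x_{n-2}!}, && x_1 + x_2 + \cdots + x_{n-2} = N,\ \text{no } x = 0;\\
&\cdots\\
&\binom{n}{r} \sum \frac{N!}{x_1!\, x_2! \cdots x_{n-r}!}, && x_1 + x_2 + \cdots + x_{n-r} = N,\ \text{no } x = 0.
\end{aligned}$$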


1 Presented to the Institute of Mathematical Statistics January 2, 1936. 

2 H. S. Hall & S. R. Knight, Higher Algebra, MacMillan & Co., 4th Ed. (1924), Chap. 15.
Thus we may rewrite (2.3) as

(2.5) $$n^N = f_0(n, N) + n f_0(n - 1, N) + \frac{n(n-1)}{2!} f_0(n - 2, N) + \cdots + \binom{n}{r} f_0(n - r, N) + \cdots$$

Replacing $n$ by $n - 1$ in (2.5) there is obtained

(2.6) $$(n - 1)^N = f_0(n - 1, N) + (n - 1) f_0(n - 2, N) + \cdots + \binom{n-1}{r} f_0(n - r - 1, N) + \cdots$$

Multiplying (2.6) by $n$ and subtracting the result from (2.5), there is obtained

(2.7) $$n^N - n(n - 1)^N = f_0(n, N) - \frac{n(n-1)}{2!} f_0(n - 2, N) - \cdots - r\binom{n}{r+1} f_0(n - r - 1, N) - \cdots$$

Replacing $n$ by $n - 2$ in (2.5) there is obtained

(2.8) $$(n - 2)^N = f_0(n - 2, N) + (n - 2) f_0(n - 3, N) + \cdots + \binom{n-2}{r-1} f_0(n - r - 1, N) + \cdots$$

Multiplying (2.8) by $n(n - 1)/2$ and adding the result to (2.7), there is obtained

(2.9) $$n^N - n(n - 1)^N + \frac{n(n-1)}{2!}(n - 2)^N = f_0(n, N) + \frac{n(n-1)(n-2)}{3!} f_0(n - 3, N) + \cdots + \frac{r(r-1)}{2!}\binom{n}{r+1} f_0(n - r - 1, N) + \cdots$$

Continuing this process, there is finally obtained the result that

(2.10) $$f_0(n, N) = n^N - n(n - 1)^N + \frac{n(n-1)}{2!}(n - 2)^N - \cdots \pm n \cdot 1^N$$


It may be shown 3 that the right side of (2.10) is $\Delta^n x^N$ for $x = 0$. The author
has elsewhere obtained (2.10), but by a special procedure not applicable to the
general case. 4

We may readily verify (2.10) for example, for $n = 3$, $N = 5$. If $x_1 + x_2 + x_3 = 5$
and no $x = 0$, then the sets of solutions are (3,1,1), (1,3,1), (1,1,3),
(2,2,1), (2,1,2), (1,2,2), and $f_0(3,5) = 3\,\dfrac{5!}{3!} + 3\,\dfrac{5!}{2!\,2!} = 150$. From (2.10)
there is obtained $f_0(3,5) = 3^5 - 3 \cdot 2^5 + 3 \cdot 2/2 = 150$.
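The verification can be repeated mechanically for other values. The sketch below compares a brute-force sum of multinomial coefficients over all compositions with no part equal to zero against the alternating sum $n^N - n(n-1)^N + \binom{n}{2}(n-2)^N - \cdots$; the function names are illustrative.

```python
from itertools import product
from math import comb, factorial


def multinomial_sum_excluding(n, N, excluded):
    """Sum of N!/(x_1! ... x_n!) over x_1 + ... + x_n = N with no x_i in `excluded`."""
    total = 0
    for xs in product(range(N + 1), repeat=n):
        if sum(xs) != N or any(x in excluded for x in xs):
            continue
        term = factorial(N)
        for x in xs:
            term //= factorial(x)   # stays an integer at every step
        total += term
    return total


def f0_closed(n, N):
    """Alternating sum: sum over k of (-1)^k C(n, k) (n - k)^N."""
    return sum((-1) ** k * comb(n, k) * (n - k) ** N for k in range(n + 1))


if __name__ == "__main__":
    print(multinomial_sum_excluding(3, 5, {0}), f0_closed(3, 5))
```

The brute-force enumeration grows as $(N+1)^n$, so it is only practical for small cases, which is all a spot check needs.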


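3 E. T. Whittaker & G. Robinson, The Calculus of Observations, Blackie & Son Ltd.
(1924), p. 7.

4 S. Kullback, “On the Bernoulli Distribution,” Bull. Am. Math. Soc., December, 1935.
For the general case, we return again to (2.3) and rearrange the right side
into the sum of a number of terms as follows:

(2.11)
$$\begin{aligned}
&\sum \frac{N!}{x_1!\, x_2! \cdots x_n!}, && x_1 + x_2 + \cdots + x_n = N,\ \text{no } x = r;\\
&\frac{n}{r!} \sum \frac{N!}{x_1!\, x_2! \cdots x_{n-1}!}, && x_1 + x_2 + \cdots + x_{n-1} = N - r,\ \text{no } x = r;\\
&\frac{n(n-1)}{2!\,(r!)^2} \sum \frac{N!}{x_1!\, x_2! \cdots x_{n-2}!}, && x_1 + x_2 + \cdots + x_{n-2} = N - 2r,\ \text{no } x = r;\\
&\cdots\\
&\binom{n}{k} \frac{1}{(r!)^k} \sum \frac{N!}{x_1!\, x_2! \cdots x_{n-k}!}, && x_1 + x_2 + \cdots + x_{n-k} = N - kr,\ \text{no } x = r.
\end{aligned}$$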

Thus we may rewrite (2.3) as

(2.12) $$n^N = f_r(n, N) + \frac{nN^{(r)}}{r!} f_r(n - 1, N - r) + \frac{n(n-1)N^{(2r)}}{2!\,(r!)^2} f_r(n - 2, N - 2r) + \cdots$$

where $N^{(k)} = N(N - 1)(N - 2) \cdots (N - k + 1)$.

Replacing $n$ by $n - 1$ and $N$ by $N - r$ in (2.12) there is obtained

(2.13) $$(n - 1)^{N-r} = f_r(n - 1, N - r) + \frac{(n - 1)(N - r)^{(r)}}{r!} f_r(n - 2, N - 2r) + \cdots$$

Multiplying (2.13) by $\dfrac{nN^{(r)}}{r!}$ and subtracting the result from (2.12), there is
obtained

(2.14) $$n^N - \frac{nN^{(r)}}{r!}(n - 1)^{N-r} = f_r(n, N) - \frac{n(n-1)N^{(2r)}}{2!\,(r!)^2} f_r(n - 2, N - 2r) - \cdots$$

By continuing this process, in a manner similar to that used for the case $r = 0$
there is finally obtained

(2.15) $$f_r(n, N) = n^N - \frac{nN^{(r)}}{r!}(n - 1)^{N-r} + \frac{n(n-1)N^{(2r)}}{2!\,(r!)^2}(n - 2)^{N-2r} - \binom{n}{3}\frac{N^{(3r)}}{(r!)^3}(n - 3)^{N-3r} + \cdots$$

By setting $r = 0$ in (2.15), there is of course obtained the value already
found in (2.10).

We may readily verify (2.15) for example, for $n = 3$, $N = 5$, $r = 2$. If
$x_1 + x_2 + x_3 = 5$ and no $x = 2$, then the sets of solutions are (5,0,0), (0,5,0),
(0,0,5), (4,1,0), (1,4,0), (1,0,4), (4,0,1), (0,1,4), (0,4,1), (3,1,1), (1,3,1), (1,1,3),
and $f_2(3,5) = 3 \cdot 5!/5! + 6 \cdot 5!/4! + 3 \cdot 5!/3! = 93$. From (2.15) there is
obtained $f_2(3,5) = 3^5 - 3 \cdot 5 \cdot 4 \cdot 2^3/2! + 3 \cdot 2 \cdot 5 \cdot 4 \cdot 3 \cdot 2/2!\,(2!)^2 = 93$.
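The same check extends to any excluded value $r$. The sketch below evaluates the alternating sum $\sum_k (-1)^k \binom{n}{k} \dfrac{N^{(kr)}}{(r!)^k}(n-k)^{N-kr}$, stopping once $kr$ exceeds $N$ since $N^{(kr)}$ then vanishes, and compares it with direct enumeration; the function names are illustrative.

```python
from itertools import product
from math import comb, factorial


def falling(N, k):
    """N^{(k)} = N (N - 1) ... (N - k + 1), with N^{(0)} = 1."""
    out = 1
    for j in range(k):
        out *= N - j
    return out


def f_r_closed(n, N, r):
    """Sum of multinomial coefficients over compositions of N into n parts, no part equal to r."""
    total = 0
    for k in range(n + 1):
        if k * r > N:
            break                                   # N^{(kr)} = 0 from here on
        ways = falling(N, k * r) // factorial(r) ** k   # exact integer division
        total += (-1) ** k * comb(n, k) * ways * (n - k) ** (N - k * r)
    return total


def f_r_brute(n, N, r):
    """Direct enumeration of the same sum."""
    total = 0
    for xs in product(range(N + 1), repeat=n):
        if sum(xs) != N or r in xs:
            continue
        term = factorial(N)
        for x in xs:
            term //= factorial(x)
        total += term
    return total


if __name__ == "__main__":
    print(f_r_closed(3, 5, 2), f_r_brute(3, 5, 2))
```

Python evaluates `0 ** 0` as 1, which is the value wanted when all $n$ cells are fixed at $r$ and no trials remain.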

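The same method of procedure may be applied to evaluate

(2.16) $$f_{rs}(n, N) = \sum \frac{N!}{x_1!\, x_2! \cdots x_n!}, \qquad x_1 + x_2 + \cdots + x_n = N,\ \text{no } x = r, s.$$

Thus, there is derived the result that

(2.17)
$$\begin{aligned}
f_{rs}(n, N) = n^N &- n\left(\frac{N^{(r)}(n - 1)^{N-r}}{r!} + \frac{N^{(s)}(n - 1)^{N-s}}{s!}\right)\\
&+ n(n - 1)\left(\frac{N^{(2r)}(n - 2)^{N-2r}}{2!\,(r!)^2} + \frac{N^{(r+s)}(n - 2)^{N-r-s}}{(r!)(s!)} + \frac{N^{(2s)}(n - 2)^{N-2s}}{2!\,(s!)^2}\right)\\
&- n(n - 1)(n - 2)\left(\frac{N^{(3r)}(n - 3)^{N-3r}}{3!\,(r!)^3} + \frac{N^{(2r+s)}(n - 3)^{N-2r-s}}{2!\,(r!)^2 (s!)} + \frac{N^{(r+2s)}(n - 3)^{N-r-2s}}{2!\,(r!)(s!)^2} + \frac{N^{(3s)}(n - 3)^{N-3s}}{3!\,(s!)^3}\right) + \cdots
\end{aligned}$$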


We may readily verify (2.17) for example, for $n = 3$, $N = 5$, $r = 0$, $s = 2$.
If $x_1 + x_2 + x_3 = 5$ and no $x = 0$ or 2, then the sets of solutions are (3,1,1),
(1,3,1), (1,1,3) and $f_{02}(3,5) = 3 \cdot 5!/3! = 60$. From (2.17) there is obtained
$f_{02}(3,5) = 3^5 - 3(2^5 + 5 \cdot 4 \cdot 2^3/2!) + 3 \cdot 2(1/2! + 5 \cdot 4/2! + 5 \cdot 4 \cdot 3 \cdot 2/(2!)^3) = 60$.
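The two-value case can be written as a double sum over the number $j$ of cells fixed at $r$ and the number $k$ fixed at $s$. The sketch below assumes $r \ne s$ and uses the general term $(-1)^{j+k}\,\dfrac{n!}{j!\,k!\,(n-j-k)!}\,\dfrac{N^{(jr+ks)}}{(r!)^j (s!)^k}\,(n-j-k)^{N-jr-ks}$, which reproduces the groups of terms written out above; the function names are illustrative.

```python
from itertools import product
from math import factorial


def falling(N, k):
    """N^{(k)} = N (N - 1) ... (N - k + 1), with N^{(0)} = 1."""
    out = 1
    for j in range(k):
        out *= N - j
    return out


def f_rs_closed(n, N, r, s):
    """Multinomial sum over compositions of N into n parts with no part equal to r or s (r != s)."""
    total = 0
    for j in range(n + 1):
        for k in range(n + 1 - j):
            used = j * r + k * s
            if used > N:
                continue                             # falling factorial vanishes
            cells = factorial(n) // (factorial(j) * factorial(k) * factorial(n - j - k))
            ways = falling(N, used) // (factorial(r) ** j * factorial(s) ** k)
            total += (-1) ** (j + k) * cells * ways * (n - j - k) ** (N - used)
    return total


def f_rs_brute(n, N, r, s):
    """Direct enumeration of the same sum."""
    total = 0
    for xs in product(range(N + 1), repeat=n):
        if sum(xs) != N or r in xs or s in xs:
            continue
        term = factorial(N)
        for x in xs:
            term //= factorial(x)
        total += term
    return total


if __name__ == "__main__":
    print(f_rs_closed(3, 5, 0, 2), f_rs_brute(3, 5, 0, 2))
```

The factor $n!/(j!\,k!\,(n-j-k)!)$ accounts for the $1/2!$ and $1/3!$ divisors that appear in the written-out terms.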
It will be shown later (see section 8) that 


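(2.18) $$f_r(n, N) = f_{rs}(n, N) + \frac{nN^{(s)}}{s!} f_{rs}(n - 1, N - s) + \frac{n(n-1)N^{(2s)}}{2!\,(s!)^2} f_{rs}(n - 2, N - 2s) + \cdots$$

(2.19) $$f_s(n, N) = f_{rs}(n, N) + \frac{nN^{(r)}}{r!} f_{rs}(n - 1, N - r) + \frac{n(n-1)N^{(2r)}}{2!\,(r!)^2} f_{rs}(n - 2, N - 2r) + \cdots$$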


From (2.18) and (2.19) there may be derived, by a method similar to that
employed in deriving (2.15), that

(2.20) $$f_{rs}(n, N) = f_r(n, N) - \frac{nN^{(s)}}{s!} f_r(n - 1, N - s) + \frac{n(n-1)N^{(2s)}}{2!\,(s!)^2} f_r(n - 2, N - 2s) - \cdots$$

This latter result also follows from (2.17) and (2.15).
Let us now consider the following generalization of (2.1). There is desired
in terms of $N$, $n$, $r$, $a_1, a_2, \cdots, a_n$, the value of

(2.21) $$F_r(n, N, a_1, a_2, \cdots, a_n) = \sum \frac{N!}{x_1!\, x_2! \cdots x_n!}\, a_1^{x_1} a_2^{x_2} \cdots a_n^{x_n}$$

where $a_1, a_2, \cdots, a_n$ are constants and the summation is for all values of
$x_1, x_2, \cdots, x_n$ such that $x_1 + x_2 + \cdots + x_n = N$ and no $x = r$. The method
of procedure is the same as that for the case already considered, viz. when
$a_1 = a_2 = \cdots = a_n = 1$.

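The sum in (2.2) may be rearranged into the sum of a number of terms as
follows:

(2.22)
$$\begin{aligned}
&\sum \frac{N!}{x_1!\, x_2! \cdots x_n!}\, a_1^{x_1} a_2^{x_2} \cdots a_n^{x_n}, \qquad x_1 + x_2 + \cdots + x_n = N,\ \text{no } x = r;\\
&\frac{a_1^r}{r!} \sum \frac{N!}{x_2! \cdots x_n!}\, a_2^{x_2} \cdots a_n^{x_n} + \cdots + \frac{a_n^r}{r!} \sum \frac{N!}{x_1! \cdots x_{n-1}!}\, a_1^{x_1} \cdots a_{n-1}^{x_{n-1}},\\
&\qquad x_1 + x_2 + \cdots + x_{n-1} = N - r,\ \text{etc., no } x = r;\\
&\cdots\\
&\frac{a_1^r \cdots a_k^r}{(r!)^k} \sum \frac{N!}{x_{k+1}! \cdots x_n!}\, a_{k+1}^{x_{k+1}} \cdots a_n^{x_n} + \cdots + \frac{a_{n-k+1}^r \cdots a_n^r}{(r!)^k} \sum \frac{N!}{x_1! \cdots x_{n-k}!}\, a_1^{x_1} \cdots a_{n-k}^{x_{n-k}},\\
&\qquad x_1 + x_2 + \cdots + x_{n-k} = N - kr,\ \text{etc., no } x = r.
\end{aligned}$$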

For convenience, let us write

(2.23)
$$\begin{aligned}
A(n, N) &= (a_1 + a_2 + \cdots + a_n)^N\\
A_i(n - 1, N) &= (a_1 + \cdots + a_{i-1} + a_{i+1} + \cdots + a_n)^N\\
A_{ij}(n - 2, N) &= (a_1 + \cdots + a_{i-1} + a_{i+1} + \cdots + a_{j-1} + a_{j+1} + \cdots + a_n)^N\\
G_r(n, N) &= F_r(n, N, a_1, a_2, \cdots, a_n)\\
G_r(n - 1, N, a_i) &= F_r(n - 1, N, a_1, a_2, \cdots, a_{i-1}, a_{i+1}, \cdots, a_n)\\
G_r(n - 2, N, a_i, a_j) &= F_r(n - 2, N, a_1, \cdots, a_{i-1}, a_{i+1}, \cdots, a_{j-1}, a_{j+1}, \cdots, a_n)
\end{aligned}$$
From (2.24), there are obtained $n$ equations

(2.25) $$A_i(n - 1, N - r) = G_r(n - 1, N - r, a_i) + \frac{(N - r)^{(r)}}{r!} \sum_{j} a_j^r\, G_r(n - 2, N - 2r, a_i, a_j) + \cdots \qquad (i = 1, 2, \cdots, n;\ j \ne i)$$


Multiplying (2.25) by $a_i^r N^{(r)}/r!$ and subtracting the result from (2.24), there
is obtained

(2.26) $$A(n, N) - \frac{N^{(r)}}{r!} \sum_{i=1}^{n} a_i^r A_i(n - 1, N - r) = G_r(n, N) - \frac{N^{(2r)}}{2!\,(r!)^2} \sum_{i,j=1}^{n} a_i^r a_j^r\, G_r(n - 2, N - 2r, a_i, a_j) - \cdots \qquad (i \ne j, \text{ etc.}).$$


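Continuing this procedure, there is finally obtained

$$
\begin{aligned}
G_r(n, N) = F_r(n, N, a_1, a_2, \ldots, a_n) = A(n, N) &- \frac{N^{(r)}}{r!} \sum_{i=1}^{n} a_i^r\, A_i(n-1, N-r)\\
&+ \frac{N^{(2r)}}{2!\,(r!)^2} \sum_{i,j=1}^{n} a_i^r a_j^r\, A_{ij}(n-2, N-2r) - \cdots \qquad (i \ne j,\ \text{etc.})
\end{aligned}
\tag{2.27}
$$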
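The expansion (2.27) can be checked numerically on a small case. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, $N^{(k)}$ is read as the falling factorial $N(N-1)\cdots(N-k+1)$, and the sums over distinct ordered indices are replaced by sums over unordered subsets, which absorbs the factors $1/2!$, $1/3!$, etc.

```python
from fractions import Fraction
from itertools import combinations, product
from math import factorial, prod


def falling(N, k):
    """Falling factorial N^(k) = N(N-1)...(N-k+1); N^(0) = 1."""
    return prod(range(N - k + 1, N + 1))


def F_brute(r, N, a):
    """Sum the multinomial terms directly, skipping every x-vector that contains r."""
    total = Fraction(0)
    for xs in product(range(N + 1), repeat=len(a)):
        if sum(xs) != N or r in xs:
            continue
        term = Fraction(factorial(N))
        for ai, x in zip(a, xs):
            term *= Fraction(ai) ** x / factorial(x)
        total += term
    return total


def F_formula(r, N, a):
    """Alternating expansion in the manner of (2.27), summed over unordered subsets."""
    a = [Fraction(v) for v in a]
    total_a = sum(a)
    out = Fraction(0)
    for j in range(len(a) + 1):
        if j * r > N:          # N^(jr) vanishes beyond this point
            break
        coef = Fraction(falling(N, j * r), factorial(r) ** j)
        for subset in combinations(range(len(a)), j):
            removed = sum(a[i] for i in subset)
            weight = prod(a[i] ** r for i in subset)
            out += (-1) ** j * coef * weight * (total_a - removed) ** (N - j * r)
    return out


# Small check with hypothetical constants a = (1, 2, 3) and N = 4.
for r in (0, 1, 2):
    assert F_brute(r, 4, [1, 2, 3]) == F_formula(r, 4, [1, 2, 3])
```

Exact rational arithmetic is used so that the two sides can be compared for equality rather than to within rounding.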


Similar results are obtainable for

$$G_{rs \cdots t} = F_{rs \cdots t}(n, N, a_1, a_2, \ldots, a_n) = \sum \frac{N!}{x_1!\,x_2! \cdots x_n!}\, a_1^{x_1} a_2^{x_2} \cdots a_n^{x_n} \tag{2.28}$$

where the summation is for all values of $x$, such that $x_1 + x_2 + \cdots + x_n = N$,
and no $x = r, s, \ldots,$ or $t$.

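Thus, it will be shown later (see section 8), that

$$
\begin{aligned}
G_r(n, N) = G_{rs}(n, N) &+ \frac{N^{(s)}}{s!} \sum_{i=1}^{n} a_i^s\, G_{rs}(n-1, N-s, a_i)\\
&+ \frac{N^{(2s)}}{2!\,(s!)^2} \sum_{i,j=1}^{n} a_i^s a_j^s\, G_{rs}(n-2, N-2s, a_i, a_j) + \cdots \qquad (i \ne j,\ \text{etc.})
\end{aligned}
\tag{2.29}
$$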


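Corresponding to the derivation of (2.27), there is obtained from (2.29)
the fact that

$$
\begin{aligned}
G_{rs}(n, N) = G_r(n, N) &- \frac{N^{(s)}}{s!} \sum_{i=1}^{n} a_i^s\, G_r(n-1, N-s, a_i)\\
&+ \frac{N^{(2s)}}{2!\,(s!)^2} \sum_{i,j=1}^{n} a_i^s a_j^s\, G_r(n-2, N-2s, a_i, a_j) - \cdots \qquad (i \ne j,\ \text{etc.})
\end{aligned}
\tag{2.30}
$$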


3. The problem to be studied. Consider a trial in which one of $n$ mutually
exclusive events may occur, with the respective probabilities of occurrence
$p_1, p_2, \ldots, p_n$, where $p_1 + p_2 + \cdots + p_n = 1$. The probabilities of the
various combinations of events which are possible in $N$ trials are given by the
terms of the expansion of $(p_1 + p_2 + \cdots + p_n)^N$.

In the $N$ trials some of the possible events may not occur, others may occur
once, twice, etc. It is desired to study the distribution of the number of events
which do not occur; the distribution of the number of events which occur once
each, etc. The simultaneous distributions of the events above described are
also to be studied.

For example, the possible event may be the occurrence of a digit. A study 
of a sequence of random digits, in sets of ten, yielded the following three 
sample sets. 


Number of times each digit occurs in the set:

Digit    0  1  2  3  4  5  6  7  8  9

Set 1    1  0  2  1  1  2  1  0  0  2
Set 2    1  1  1  1  1  1  2  0  1  1
Set 3    0  0  2  1  2  1  2  1  0  1

Fig. 1


In the first set three events do not occur, four occur once each, and three occur 
twice each. In the second set one event does not occur, eight events occur once 
each, and one event occurs twice; etc. 
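The tallying just described can be sketched in a few lines. This is a minimal illustration, not part of the original study: the function name is hypothetical, and the digit sequence below is an assumed ordering that merely reproduces the occurrence counts of the first set in Fig. 1 (the source gives the counts, not the sequence).

```python
from collections import Counter


def occupancy_profile(digits, n_events=10):
    """Return a Counter mapping k -> number of events (digits) that occur exactly k times."""
    counts = Counter(digits)
    per_event = [counts.get(d, 0) for d in range(n_events)]
    return Counter(per_event)


# Hypothetical ordering consistent with the first set of Fig. 1.
first_set = [0, 2, 2, 3, 4, 5, 5, 6, 9, 9]
profile = occupancy_profile(first_set)
print(profile[0], profile[1], profile[2])
```

Here `profile[0]` is the number of events not occurring, `profile[1]` the number occurring once each, and so on; these are the variates whose distributions are studied in sections 4 to 8.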

4. Distribution of the number of events not occurring. To obtain the distribution
of the number of events which do not occur, there is applied to the
expansion of $(p_1 + p_2 + \cdots + p_n)^N$ a procedure similar to that employed
in section 2.

Thus, if $\pi_{r0}$ represents the probability for $r$ events not occurring, then

$$
\begin{aligned}
\pi_{00} &= \sum \frac{N!}{x_1!\,x_2! \cdots x_n!}\, p_1^{x_1} p_2^{x_2} \cdots p_n^{x_n}, \qquad x_1 + x_2 + \cdots + x_n = N,\ \text{no } x = 0;\\[4pt]
\pi_{10} &= \sum \frac{N!}{x_2! \cdots x_n!}\, p_2^{x_2} \cdots p_n^{x_n} + \cdots + \sum \frac{N!}{x_1! \cdots x_{n-1}!}\, p_1^{x_1} \cdots p_{n-1}^{x_{n-1}},\\
&\qquad x_1 + x_2 + \cdots + x_{n-1} = N,\ \text{etc., no } x = 0;\\[4pt]
\pi_{r0} &= \sum \frac{N!}{x_{r+1}! \cdots x_n!}\, p_{r+1}^{x_{r+1}} \cdots p_n^{x_n} + \cdots + \sum \frac{N!}{x_1! \cdots x_{n-r}!}\, p_1^{x_1} \cdots p_{n-r}^{x_{n-r}},\\
&\qquad x_1 + x_2 + \cdots + x_{n-r} = N,\ \text{etc., no } x = 0;
\end{aligned}
\tag{4.1}
$$

Employing (2.21), we may write (4.1) as

$$
\begin{aligned}
\pi_{00} &= F_0(n, N, p_1, p_2, \ldots, p_n)\\
\pi_{10} &= F_0(n-1, N, p_2, \ldots, p_n) + \cdots + F_0(n-1, N, p_1, p_2, \ldots, p_{n-1})\\
\pi_{r0} &= F_0(n-r, N, p_{r+1}, \ldots, p_n) + \cdots + F_0(n-r, N, p_1, \ldots, p_{n-r})
\end{aligned}
\tag{4.2}
$$

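Since $p_1 + p_2 + \cdots + p_n = 1$ there is found from (2.27) that

$$
\begin{aligned}
\pi_{00} &= 1 - \sum_{i=1}^{n} (1 - p_i)^N + \frac{1}{2!} \sum_{i,j=1}^{n} (1 - p_i - p_j)^N - \frac{1}{3!} \sum_{i,j,k=1}^{n} (1 - p_i - p_j - p_k)^N + \cdots\\[4pt]
\pi_{10} &= \sum_{i=1}^{n} (1 - p_i)^N - \sum_{i,j=1}^{n} (1 - p_i - p_j)^N + \frac{1}{2!} \sum_{i,j,k=1}^{n} (1 - p_i - p_j - p_k)^N - \cdots\\[4pt]
\pi_{20} &= \frac{1}{2!} \left\{ \sum_{i,j=1}^{n} (1 - p_i - p_j)^N - \sum_{i,j,k=1}^{n} (1 - p_i - p_j - p_k)^N + \cdots \right\}\\[4pt]
\pi_{r0} &= \frac{1}{r!} \left\{ \sum_{a,b,\ldots,r=1}^{n} (1 - p_a - p_b - \cdots - p_r)^N - \cdots \right\}
\end{aligned}
\qquad (i \ne j,\ \text{etc.})
\tag{4.3}
$$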


The factorial moments$^5$ of the distribution given by (4.3) are easily derived.
The first factorial moment is given by $\sigma_1 = \pi_{10} + 2\pi_{20} + 3\pi_{30} + \cdots + r\pi_{r0} + \cdots$
and the summation of the proper terms in (4.3) yields

$$\sigma_1 = \sum_{i=1}^{n} (1 - p_i)^N \tag{4.4}$$

In general, the $r$-th factorial moment, given by $\sigma_r = \sum_{k \ge r} k(k-1) \cdots (k-r+1)\, \pi_{k0}$, is

$$\sigma_r = \sum_{a,b,\ldots,r=1}^{n} (1 - p_a - p_b - \cdots - p_r)^N, \qquad (a \ne b,\ \text{etc.}) \tag{4.5}$$

Indeed, (4.3) illustrates the fact that, if $f(x)$ is the probability that a discontinuous
variate takes the value $x$, then$^6$

$$f(x) = \frac{1}{x!} \sum_{k \ge 0} (-1)^k \sigma_{x+k}/k! \tag{4.6}$$

5 J. F. Steffensen, Interpolation (1927), p. 101.

6 J. F. Steffensen, "Factorial Moments and Discontinuous Frequency Functions,"
Skandinavisk Aktuarietidskrift, Vol. VI (1923), pp. 73-89.

The moments about any constant $a$ of the distribution given by (4.3) may be
derived from the factorial moments by the relation$^7$

$$E(x - a)^r = (1 + \sigma_1 \Delta + \sigma_2 \Delta^2/2! + \cdots + \sigma_r \Delta^r/r!) \cdot \xi^r \qquad (\xi = -a) \tag{4.7}$$

where $\Delta$ is the difference operator of the calculus of finite differences, and $\xi$
is replaced by $(-a)$ after the indicated operations have been performed.

Of special interest is the case when $p_1 = p_2 = \cdots = p_n = 1/n$, for which (4.3)
becomes

$$\pi_{r0} = \left(\frac{1}{n}\right)^{N} \binom{n}{r} f_0(n-r, N) = \left(\frac{1}{n}\right)^{N} \binom{n}{r} \Delta^{n-r} 0^N \tag{4.8}$$

where $f_0(n, N)$ and $\Delta^n 0^N$ are as defined in section 2. The probabilities in (4.8)
are the respective terms of the expansion of $(1/n)^N (1 + \Delta)^n 0^N$.

For this case the $r$-th factorial moment becomes

$$\sigma_r = n(n-1) \cdots (n-r+1)\,(n-r)^N/n^N \tag{4.9}$$

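There is presented an example of the distribution (4.8) for the case $n = N = 10$.
It is found that$^8$

$$
\begin{aligned}
\Delta 0^{10} &= 1 & \Delta^6 0^{10} &= 16435440\\
\Delta^2 0^{10} &= 1022 & \Delta^7 0^{10} &= 29635200\\
\Delta^3 0^{10} &= 55980 & \Delta^8 0^{10} &= 30240000\\
\Delta^4 0^{10} &= 818520 & \Delta^9 0^{10} &= 16329600\\
\Delta^5 0^{10} &= 5103000 & \Delta^{10} 0^{10} &= 3628800
\end{aligned}
\tag{4.10}
$$

$$
\begin{aligned}
\pi_{00} &= .000362880 & \pi_{50} &= .128595600\\
\pi_{10} &= .016329600 & \pi_{60} &= .017188920\\
\pi_{20} &= .136080000 & \pi_{70} &= .000671760\\
\pi_{30} &= .355622400 & \pi_{80} &= .000004599\\
\pi_{40} &= .345144240 & \pi_{90} &= .000000001
\end{aligned}
\tag{4.11}
$$

$$
\begin{aligned}
\sigma_1 &= 3.486784401 & m &= 3.486784401\\
\sigma_2 &= 9.663676416 & \sigma^2 &= 0.992795358
\end{aligned}
\tag{4.12}
$$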
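The figures in (4.10) to (4.12) can be reproduced with exact arithmetic. The following is a sketch under stated assumptions: the function names are hypothetical, $\Delta^k 0^N$ is evaluated from the standard finite-difference expansion $\sum_j (-1)^{k-j}\binom{k}{j} j^N$, and the variance is obtained from the factorial moments as $\sigma_2 + \sigma_1 - \sigma_1^2$.

```python
from fractions import Fraction
from math import comb


def delta_zero(k, N):
    """k-th forward difference of x^N at x = 0, i.e. the 'difference of zero'."""
    return sum((-1) ** (k - j) * comb(k, j) * j ** N for j in range(k + 1))


def pi_r0(r, n, N):
    """Probability (4.8) that exactly r of n equally likely events fail to occur in N trials."""
    return Fraction(comb(n, r) * delta_zero(n - r, N), n ** N)


def sigma(r, n, N):
    """r-th factorial moment (4.9)."""
    falling = 1
    for t in range(r):
        falling *= n - t
    return Fraction(falling * (n - r) ** N, n ** N)


n = N = 10
probabilities = [pi_r0(r, n, N) for r in range(n + 1)]
mean = sigma(1, n, N)
variance = sigma(2, n, N) + mean - mean ** 2
print([delta_zero(k, N) for k in range(1, 11)])
print([float(p) for p in probabilities])
print(float(mean), float(variance))
```

Working in `Fraction` avoids any rounding until the final conversion to decimals, which makes the comparison with the nine-place values of (4.11) unambiguous.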


7 This result is derived as follows: $(x - a)^r = (1 + \Delta)^x \cdot (-a)^r$; $E(x - a)^r = \sum_x (x - a)^r f(x) = \left( \sum_x (1 + \Delta)^x f(x) \right) (-a)^r = \left( \sum_x \left(1 + x\Delta + x(x-1)\Delta^2/2! + \cdots\right) f(x) \right) (-a)^r$. For
a bivariate distribution it may be shown similarly that, symbolically, $E\{(x - a)^r (y - b)^s\} = \{\exp(\sigma_{1\cdot} \Delta_1 + \sigma_{\cdot 1} \Delta_2)\} \cdot (-a)^r (-b)^s$, where $\sigma_{1\cdot}^m \sigma_{\cdot 1}^n \equiv \sigma_{mn}$ and $\Delta_1$ operates only on $a$ and $\Delta_2$
operates only on $b$. A similar result may be derived for a multivariate distribution.

8 Cf. Whittaker & Robinson, op. cit., p. 7.

The observed distribution was obtained by distributing 200 sets of ten digits
each, the digits being found in Tippett's Random Sampling Numbers.$^9$ The
results obtained are given in Fig. 2. Three of the 200 observed sets were
illustrated in section 3.

The agreement between observed results and theoretical values is gratifying.

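5. Distribution of the number of events which occur once each. Let $\pi_{k1}$
represent the probability that there are $k$ events which occur once each. Thus,
the various probabilities, obtained by rearranging the terms of the expansion of
$(p_1 + p_2 + \cdots + p_n)^N$, are as follows:

$$
\begin{aligned}
\pi_{01} &= \sum \frac{N!}{x_1!\,x_2! \cdots x_n!}\, p_1^{x_1} p_2^{x_2} \cdots p_n^{x_n}, \qquad x_1 + x_2 + \cdots + x_n = N,\ \text{no } x = 1;\\[4pt]
\pi_{11} &= p_1 \sum \frac{N!}{x_2! \cdots x_n!}\, p_2^{x_2} \cdots p_n^{x_n} + \cdots + p_n \sum \frac{N!}{x_1! \cdots x_{n-1}!}\, p_1^{x_1} \cdots p_{n-1}^{x_{n-1}},\\
&\qquad x_1 + x_2 + \cdots + x_{n-1} = N - 1,\ \text{etc., no } x = 1;\\[4pt]
\pi_{k1} &= p_1 p_2 \cdots p_k \sum \frac{N!}{x_{k+1}! \cdots x_n!}\, p_{k+1}^{x_{k+1}} \cdots p_n^{x_n} + \cdots + p_{n-k+1} \cdots p_n \sum \frac{N!}{x_1! \cdots x_{n-k}!}\, p_1^{x_1} \cdots p_{n-k}^{x_{n-k}},\\
&\qquad x_1 + x_2 + \cdots + x_{n-k} = N - k,\ \text{etc., no } x = 1;
\end{aligned}
\tag{5.1}
$$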


No. of events    Observed     Theoretical
not occurring    frequency    frequency
      x              f                         xf    x(x-1)f

      0              0           0.08           0        0
      1              8           3.26           8        0
      2             22          27.22          44       44
      3             72          71.12         216      432
      4             72          69.02         288      864
      5             21          25.72         105      420
      6              4           3.44          24      120
      7              1           0.14           7       42
      8              0           0.00           0        0
      9              0           0.00           0        0

    Total          200         200.00         692     1922

Observed parameters:     $\sigma_1 = 3.46$,  $\sigma_2 = 9.61$,  $\bar{x} = 3.46$,  $s^2 = 1.0984$
Theoretical parameters:  $\sigma_1 = 3.49$,  $\sigma_2 = 9.66$,  $m = 3.49$,  $\sigma^2 = 0.99$

Fig. 2


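9 L. H. C. Tippett, Random Sampling Numbers, Tracts for Computers, No. XV (1927),
London.

In view of (2.21) and (2.27), it is found that (5.1) becomes

$$
\begin{aligned}
\pi_{01} &= 1 - N \sum_{i=1}^{n} p_i (1 - p_i)^{N-1} + \frac{N^{(2)}}{2!} \sum_{i,j=1}^{n} p_i p_j (1 - p_i - p_j)^{N-2} - \cdots\\[4pt]
\pi_{11} &= N \left\{ \sum_{i=1}^{n} p_i (1 - p_i)^{N-1} - (N-1) \sum_{i,j=1}^{n} p_i p_j (1 - p_i - p_j)^{N-2} + \cdots \right\}\\[4pt]
\pi_{21} &= \frac{N(N-1)}{2!} \left\{ \sum_{i,j=1}^{n} p_i p_j (1 - p_i - p_j)^{N-2} - \cdots \right\}
\end{aligned}
\qquad (i \ne j,\ \text{etc.})
\tag{5.2}
$$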


From (5.2) there is readily derived the fact that

$$\sigma_r = N(N-1) \cdots (N-r+1) \sum_{a,b,\ldots,r=1}^{n} p_a p_b \cdots p_r\, (1 - p_a - p_b - \cdots - p_r)^{N-r}, \qquad (a \ne b,\ \text{etc.}) \tag{5.3}$$


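For the case in which $p_1 = p_2 = \cdots = p_n = 1/n$, the distribution in (5.2)
becomes

$$
\begin{aligned}
\pi_{01} &= \left(\frac{1}{n}\right)^{N} f_1(n, N)\\
\pi_{11} &= \left(\frac{1}{n}\right)^{N} nN f_1(n-1, N-1)\\
\pi_{21} &= \left(\frac{1}{n}\right)^{N} \frac{n(n-1)N(N-1)}{2!}\, f_1(n-2, N-2)\\
\pi_{r1} &= \left(\frac{1}{n}\right)^{N} \frac{n^{(r)} N^{(r)}}{r!}\, f_1(n-r, N-r)
\end{aligned}
\tag{5.4}
$$

where $f_1(n, N)$ and $N^{(r)}$ have been defined in section 2. For this case (5.3)
becomes

$$\sigma_r = n^{(r)} N^{(r)} (n-r)^{N-r}/n^N \tag{5.5}$$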

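Evaluation of (5.4) and (5.5) for $n = N = 10$ yields,

$$
\begin{aligned}
\pi_{01} &= .00811639 & \pi_{41} &= .27052704 & \pi_{81} &= .01632960\\
\pi_{11} &= .04794633 & \pi_{51} &= .15621984 & \pi_{91} &= .00000000\ \text{(see footnote 10)}\\
\pi_{21} &= .14082336 & \pi_{61} &= .12700800 & \pi_{10,1} &= .00036288\\
\pi_{31} &= .21089376 & \pi_{71} &= .02177280 & &\\[4pt]
\sigma_1 &= 3.87420489 & m &= 3.87420489 & &\\
\sigma_2 &= 13.58954496 & \sigma^2 &= 2.45428632 & &
\end{aligned}
\tag{5.6}
$$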
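The values in (5.6) can be recomputed from (5.4) once $f_1(n, N)$ is available. The sketch below assumes that $f_r(n, N)$ is obtained from (2.27) with all constants equal to unity, which gives the alternating sum $\sum_j (-1)^j \binom{n}{j} \frac{N^{(jr)}}{(r!)^j} (n-j)^{N-jr}$; the function names are hypothetical, and $0^0$ is taken as 1.

```python
from fractions import Fraction
from math import comb, factorial


def falling(N, k):
    """Falling factorial N^(k) = N(N-1)...(N-k+1); N^(0) = 1."""
    out = 1
    for t in range(k):
        out *= N - t
    return out


def f(r, n, N):
    """f_r(n, N): number of ways to place N trials on n events with no event occurring r times."""
    total = Fraction(0)
    for j in range(n + 1):
        if j * r > N:          # N^(jr) vanishes beyond this point
            break
        term = comb(n, j) * falling(N, j * r) * (n - j) ** (N - j * r)
        total += Fraction((-1) ** j * term, factorial(r) ** j)
    return total


def pi_k1(k, n, N):
    """Probability (5.4) that exactly k events occur once each (equal probabilities)."""
    if k > n or k > N:
        return Fraction(0)
    return Fraction(falling(n, k) * falling(N, k), factorial(k)) * f(1, n - k, N - k) / n ** N


distribution = [pi_k1(k, 10, 10) for k in range(11)]
print([round(float(p), 8) for p in distribution])
```

The zero at $k = 9$ falls out of the formula itself, since $f_1(1, 1) = 0$: a single remaining event cannot absorb a single remaining trial without occurring exactly once.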


10 For the case $n = N = 10$ there cannot be 9 events occurring once each, since then the
tenth event must also occur once.

The observed distribution, given in Fig. 3, was obtained from the 200 sets
previously considered.

The agreement between the observed results and theoretical values is
gratifying.

6. Distribution of the number of events which occur r times each. Let
$\pi_{kr}$ represent the probability that there are $k$ events occurring $r$ times each.
Thus, the various probabilities, obtained by rearranging the terms of the expansion
of $(p_1 + p_2 + \cdots + p_n)^N$, are as follows:


No. of events
occurring        Observed     Theoretical
once each        frequency    frequency
      x              f                         xf    x(x-1)f

      0              1           1.62           0        0
      1             10           9.58          10        0
      2             30          28.16          60       60
      3             37          42.18         111      222
      4             62          54.10         248      744
      5             27          31.24         135      540
      6             22          25.40         132      660
      7              3           4.36          21      126
      8              8           3.26          64      448
      9              0           0.00           0        0
     10              0           0.08           0        0

    Total          200         199.98         781     2800

Observed parameters:     $\sigma_1 = 3.905$,  $\sigma_2 = 14.000$,  $\bar{x} = 3.905$,  $s^2 = 2.656$
Theoretical parameters:  $\sigma_1 = 3.874$,  $\sigma_2 = 13.590$,  $m = 3.874$,  $\sigma^2 = 2.454$

Fig. 3


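$$
\begin{aligned}
\pi_{0r} &= \sum \frac{N!}{x_1! \cdots x_n!}\, p_1^{x_1} \cdots p_n^{x_n}, \qquad x_1 + x_2 + \cdots + x_n = N,\ \text{no } x = r;\\[4pt]
\pi_{1r} &= \frac{p_1^r}{r!} \sum \frac{N!}{x_2! \cdots x_n!}\, p_2^{x_2} \cdots p_n^{x_n} + \cdots + \frac{p_n^r}{r!} \sum \frac{N!}{x_1! \cdots x_{n-1}!}\, p_1^{x_1} \cdots p_{n-1}^{x_{n-1}},\\
&\qquad x_1 + x_2 + \cdots + x_{n-1} = N - r,\ \text{etc., no } x = r;\\[4pt]
\pi_{kr} &= \frac{p_1^r p_2^r \cdots p_k^r}{(r!)^k} \sum \frac{N!}{x_{k+1}! \cdots x_n!}\, p_{k+1}^{x_{k+1}} \cdots p_n^{x_n} + \cdots + \frac{p_{n-k+1}^r \cdots p_n^r}{(r!)^k} \sum \frac{N!}{x_1! \cdots x_{n-k}!}\, p_1^{x_1} \cdots p_{n-k}^{x_{n-k}},\\
&\qquad x_1 + x_2 + \cdots + x_{n-k} = N - kr,\ \text{etc., no } x = r;
\end{aligned}
\tag{6.1}
$$

In view of (2.21) and (2.27) it is found that (6.1) becomes

$$
\begin{aligned}
\pi_{0r} &= 1 - \frac{N^{(r)}}{r!} \sum_{i=1}^{n} p_i^r (1 - p_i)^{N-r} + \frac{N^{(2r)}}{2!\,(r!)^2} \sum_{i,j=1}^{n} p_i^r p_j^r (1 - p_i - p_j)^{N-2r} - \cdots\\[4pt]
\pi_{1r} &= \frac{N^{(r)}}{r!} \left\{ \sum_{i=1}^{n} p_i^r (1 - p_i)^{N-r} - \frac{(N-r)^{(r)}}{r!} \sum_{i,j=1}^{n} p_i^r p_j^r (1 - p_i - p_j)^{N-2r} + \cdots \right\}\\[4pt]
\pi_{2r} &= \frac{N^{(2r)}}{2!\,(r!)^2} \left\{ \sum_{i,j=1}^{n} p_i^r p_j^r (1 - p_i - p_j)^{N-2r} - \cdots \right\}
\end{aligned}
\qquad (i \ne j,\ \text{etc.})
\tag{6.2}
$$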

From (6.2) there is readily derived the fact that

$$\sigma_k = \frac{N^{(kr)}}{(r!)^k} \sum_{a,b,\ldots,k=1}^{n} p_a^r p_b^r \cdots p_k^r\, (1 - p_a - p_b - \cdots - p_k)^{N-kr}, \qquad (a \ne b,\ \text{etc.}) \tag{6.3}$$

For $r = 0, 1$, (6.2) and (6.3) reduce to the values previously derived.


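For the case in which $p_1 = p_2 = \cdots = p_n = 1/n$, the distribution in (6.2)
becomes

$$
\begin{aligned}
\pi_{0r} &= \left(\frac{1}{n}\right)^{N} f_r(n, N)\\
\pi_{1r} &= \left(\frac{1}{n}\right)^{N} \frac{n N^{(r)}}{r!}\, f_r(n-1, N-r)\\
\pi_{kr} &= \left(\frac{1}{n}\right)^{N} \frac{n^{(k)} N^{(kr)}}{k!\,(r!)^k}\, f_r(n-k, N-kr)
\end{aligned}
\tag{6.4}
$$

where $f_r(n, N)$ has been defined in section 2. For this case (6.3) becomes

$$\sigma_k = N^{(kr)} n^{(k)} (n-k)^{N-kr}/(r!)^k n^N \tag{6.5}$$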

7. Simultaneous distribution of the number of events not occurring, and of
the number of events occurring once each. The probabilities for the simultaneous
occurrence of the various combinations of the number of events not
occurring, and of the number of events occurring once each, are obtained by
rearranging the terms of the expansion of $(p_1 + p_2 + \cdots + p_n)^N$, and are given
in Fig. 4.

In Fig. 4 none of the subscripts take on equal values simultaneously, and $G_{01}$
has been defined in section 2. Summation of the values in the $k$-th column
of Fig. 4 yields the probability that there are $(k - 1)$ events not occurring.
Comparison with (4.2) yields

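$$
\begin{aligned}
F_0(n, N, p_1, p_2, \ldots, p_n) = G_0(n, N) = G_{01}(n, N) &+ N \sum_{i=1}^{n} p_i\, G_{01}(n-1, N-1, p_i)\\
&+ \frac{N^{(2)}}{2!} \sum_{i,j=1}^{n} p_i p_j\, G_{01}(n-2, N-2, p_i, p_j) + \cdots, \qquad (i \ne j,\ \text{etc.})
\end{aligned}
\tag{7.1}
$$

Columns: number of events not occurring. Rows: number of events occurring once each.

$$
\begin{array}{c|cccc}
 & 0 & 1 & \cdots & r\\ \hline
0 & G_{01}(n, N) & \sum_{i=1}^{n} G_{01}(n-1, N, p_i) & \cdots & \\[6pt]
1 & N \sum_{i=1}^{n} p_i\, G_{01}(n-1, N-1, p_i) & N \sum_{i,j=1}^{n} p_i\, G_{01}(n-2, N-1, p_i, p_j) & \cdots & \\[6pt]
2 & \frac{N^{(2)}}{2!} \sum_{i,j=1}^{n} p_i p_j\, G_{01}(n-2, N-2, p_i, p_j) & \frac{N^{(2)}}{2!} \sum_{i,j,k=1}^{n} p_i p_j\, G_{01}(n-3, N-2, p_i, p_j, p_k) & \cdots & \\[6pt]
\vdots & & & & \\[6pt]
s & & & & \frac{N^{(s)}}{r!\,s!} \sum_{a,b,\ldots,s,\alpha,\beta,\ldots,\rho=1}^{n} p_a p_b \cdots p_s\, G_{01}(n-r-s, N-s, p_a, \ldots, p_s, p_\alpha, \ldots, p_\rho)
\end{array}
$$

Fig. 4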


Summation of the values in the $k$-th row of Fig. 4 yields the probability
that there are $(k - 1)$ events occurring once each. Comparison with (5.2)
and (2.27) yields

$$
\begin{aligned}
F_1(n, N, p_1, p_2, \ldots, p_n) = G_1(n, N) = G_{01}(n, N) &+ \sum_{i=1}^{n} G_{01}(n-1, N, p_i)\\
&+ \frac{1}{2!} \sum_{i,j=1}^{n} G_{01}(n-2, N, p_i, p_j) + \cdots, \qquad (i \ne j,\ \text{etc.})
\end{aligned}
\tag{7.2}
$$

If we use $x$ to represent the number of events not occurring, and $y$ the number
of events occurring once each, then it is found that

$$E(x^{(r)} y^{(s)}) = \sigma_{rs} = N^{(s)} \sum_{a,b,\ldots,s,\alpha,\beta,\ldots,\rho=1}^{n} p_a p_b \cdots p_s\, (1 - p_a - \cdots - p_s - p_\alpha - \cdots - p_\rho)^{N-s}, \qquad (a \ne b,\ \text{etc.}) \tag{7.3}$$

If ${}_0\bar{x}_{k1}$ represents the average number of events not occurring, when there
are $k$ events occurring once each, then from Fig. 4 there is found that

$$
{}_0\bar{x}_{01} = \frac{\displaystyle \sum_{i=1}^{n} G_{01}(n-1, N, p_i) + 2 \sum_{i,j=1}^{n} G_{01}(n-2, N, p_i, p_j)/2! + 3 \sum_{i,j,k=1}^{n} G_{01}(n-3, N, p_i, p_j, p_k)/3! + \cdots}{\displaystyle G_{01}(n, N) + \sum_{i=1}^{n} G_{01}(n-1, N, p_i) + \sum_{i,j=1}^{n} G_{01}(n-2, N, p_i, p_j)/2! + \cdots} \qquad (i \ne j,\ \text{etc.})
\tag{7.4}
$$

In view of (7.2), (7.4) reduces to

$${}_0\bar{x}_{01} = \left( \sum_{i=1}^{n} G_1(n-1, N, p_i) \right) \Big/ G_1(n, N) \tag{7.5}$$


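A similar procedure yields, in general,

$$
{}_0\bar{x}_{k1} = \frac{\displaystyle \sum_{a,b,\ldots,k,i=1}^{n} p_a p_b \cdots p_k\, G_1(n-k-1, N-k, p_a, p_b, \ldots, p_k, p_i)}{\displaystyle \sum_{a,b,\ldots,k=1}^{n} p_a p_b \cdots p_k\, G_1(n-k, N-k, p_a, p_b, \ldots, p_k)} \qquad (a \ne b,\ \text{etc.})
\tag{7.6}
$$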

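If ${}_1\bar{y}_{k0}$ represents the average number of events occurring once each, when
there are $k$ events not occurring, then from Fig. 4, there is found that

$$
{}_1\bar{y}_{00} = \frac{\displaystyle N \left( \sum_{i=1}^{n} p_i\, G_{01}(n-1, N-1, p_i) + 2(N-1) \sum_{i,j=1}^{n} p_i p_j\, G_{01}(n-2, N-2, p_i, p_j)/2! + \cdots \right)}{\displaystyle G_{01}(n, N) + N \sum_{i=1}^{n} p_i\, G_{01}(n-1, N-1, p_i) + N^{(2)} \sum_{i,j=1}^{n} p_i p_j\, G_{01}(n-2, N-2, p_i, p_j)/2! + \cdots} \qquad (i \ne j,\ \text{etc.})
\tag{7.7}
$$

In view of (7.1), (7.7) reduces to

$${}_1\bar{y}_{00} = \left( N \sum_{i=1}^{n} p_i\, G_0(n-1, N-1, p_i) \right) \Big/ G_0(n, N) \tag{7.8}$$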

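A similar procedure yields, in general,

$$
{}_1\bar{y}_{k0} = \frac{\displaystyle N \sum_{a,b,\ldots,k,i=1}^{n} p_i\, G_0(n-k-1, N-1, p_a, p_b, \ldots, p_k, p_i)}{\displaystyle \sum_{a,b,\ldots,k=1}^{n} G_0(n-k, N, p_a, p_b, \ldots, p_k)} \qquad (a \ne b,\ \text{etc.})
\tag{7.9}
$$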

For the case in which $p_1 = p_2 = \cdots = p_n = 1/n$, as may be found from Fig. 4,
the probability for the simultaneous occurrence of $r$ events not occurring, and
$s$ events occurring once each, is given by

$$\left(\frac{1}{n}\right)^{N} \frac{n^{(r+s)} N^{(s)}}{r!\,s!}\, f_{01}(n-r-s, N-s) \tag{7.10}$$

For this case (7.1), (7.2), (7.3), (7.6), and (7.9) yield respectively

$$f_0(n, N) = f_{01}(n, N) + nN f_{01}(n-1, N-1) + \binom{n}{2} N^{(2)} f_{01}(n-2, N-2) + \cdots \tag{7.11}$$

$$f_1(n, N) = f_{01}(n, N) + n f_{01}(n-1, N) + \binom{n}{2} f_{01}(n-2, N) + \cdots \tag{7.12}$$

$$\sigma_{rs} = N^{(s)} n^{(r+s)} (n-r-s)^{N-s}/n^N \tag{7.13}$$

$${}_0\bar{x}_{k1} = (n-k) f_1(n-k-1, N-k)/f_1(n-k, N-k) \tag{7.14}$$

$${}_1\bar{y}_{k0} = N(n-k) f_0(n-k-1, N-1)/f_0(n-k, N) \tag{7.15}$$


Let us consider again the case when $p_1 = p_2 = \cdots = p_n = 1/n$ and $n = N = 10$.
Evaluating (7.14) and (7.15) by means of (2.15) yields

(7.16)

(7.17)

The 200 sets of observations already considered yielded the simultaneous
distribution given in Fig. 5.

The distribution in Fig. 5 yields the observed value $\sigma_{11} = 11.89$; (7.13) yields the theoretical value $\sigma_{11} = 12.07959552$.

The agreement between the observed results in Fig. 5 and the theoretical
values in (7.16) and (7.17) is gratifying.
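The theoretical value of $\sigma_{11}$ quoted above follows directly from (7.13). As a minimal sketch (the function names are hypothetical, and $N^{(s)}$ is again read as a falling factorial), the joint factorial moment can be evaluated exactly as follows.

```python
from fractions import Fraction


def falling(N, k):
    """Falling factorial N^(k) = N(N-1)...(N-k+1); N^(0) = 1."""
    out = 1
    for t in range(k):
        out *= N - t
    return out


def sigma_rs(r, s, n, N):
    """Joint factorial moment (7.13) for n equally likely events and N trials."""
    return Fraction(falling(N, s) * falling(n, r + s) * (n - r - s) ** (N - s), n ** N)


s11 = sigma_rs(1, 1, 10, 10)
# Product-moment about the means of x (events not occurring) and y (events occurring once).
covariance = s11 - sigma_rs(1, 0, 10, 10) * sigma_rs(0, 1, 10, 10)
print(float(s11), float(covariance))
```

Setting $s = 0$ recovers (4.9) and setting $r = 0$ recovers (5.5), so the same function also gives the two marginal means used in the covariance.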

8. Simultaneous distribution of the number of events which occur r times
each, and of the number of events which occur s times each. The probabilities
for the simultaneous occurrence of the various combinations of the number of
events which occur $r$ times each, and of the number of events which occur $s$
times each, are obtained by rearranging the terms of the expansion of
$(p_1 + p_2 + \cdots + p_n)^N$. If $\pi_{kr,ls}$ is the probability for the simultaneous occurrence of
$k$ events which occur $r$ times each and $l$ events which occur $s$ times each, then

$$
\pi_{kr,ls} = \frac{N^{(kr+ls)}}{k!\,l!\,(r!)^k (s!)^l} \sum_{a,\ldots,k,\alpha,\ldots,\lambda=1}^{n} p_a^r \cdots p_k^r\, p_\alpha^s \cdots p_\lambda^s\, G_{rs}(n-k-l, N-kr-ls, p_a, \ldots, p_k, p_\alpha, \ldots, p_\lambda), \qquad (a \ne b,\ \text{etc.})
\tag{8.1}
$$

where $G_{rs}$ is defined in section 2.

From (8.1) and (6.2), there is derived, in a manner similar to the derivation
of (7.1) and (7.2), the result that

$$
\begin{aligned}
F_r(n, N, p_1, \ldots, p_n) = G_r(n, N) = G_{rs}(n, N) &+ \frac{N^{(s)}}{s!} \sum_{i=1}^{n} p_i^s\, G_{rs}(n-1, N-s, p_i)\\
&+ \frac{N^{(2s)}}{2!\,(s!)^2} \sum_{i,j=1}^{n} p_i^s p_j^s\, G_{rs}(n-2, N-2s, p_i, p_j) + \cdots, \qquad (i \ne j,\ \text{etc.})
\end{aligned}
\tag{8.2}
$$

and a similar result by interchanging $r$ and $s$ in (8.2).
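For the distribution given by (8.1), it is found that

$$
\sigma_{kl} = \frac{N^{(kr+ls)}}{(r!)^k (s!)^l} \sum_{a,b,\ldots,k,\alpha,\beta,\ldots,\lambda=1}^{n} p_a^r \cdots p_k^r\, p_\alpha^s \cdots p_\lambda^s\, (1 - p_a - \cdots - p_k - p_\alpha - \cdots - p_\lambda)^{N-kr-ls}, \qquad (a \ne b,\ \text{etc.})
\tag{8.3}
$$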


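If ${}_r\bar{x}_{ls}$ represents the average number of events which occur $r$ times each
when there are $l$ events which occur $s$ times each, then from (8.1) and (8.2),
in a manner similar to the derivation of (7.6), it is found that

$$
{}_r\bar{x}_{ls} = \frac{\displaystyle (N-ls)^{(r)} \sum_{a,\alpha,\ldots,\lambda=1}^{n} p_a^r\, p_\alpha^s \cdots p_\lambda^s\, G_s(n-l-1, N-r-ls, p_a, p_\alpha, \ldots, p_\lambda)}{\displaystyle r! \sum_{\alpha,\ldots,\lambda=1}^{n} p_\alpha^s \cdots p_\lambda^s\, G_s(n-l, N-ls, p_\alpha, \ldots, p_\lambda)} \qquad (a \ne \alpha,\ \text{etc.})
\tag{8.4}
$$

If ${}_s\bar{y}_{kr}$ represents the average number of events which occur $s$ times each
when there are $k$ events which occur $r$ times each, then by interchanging $k$ and $l$,
and $r$ and $s$ in (8.4), there is found

$$
{}_s\bar{y}_{kr} = \frac{\displaystyle (N-kr)^{(s)} \sum_{a,\ldots,k,\alpha=1}^{n} p_a^r \cdots p_k^r\, p_\alpha^s\, G_r(n-k-1, N-kr-s, p_a, \ldots, p_k, p_\alpha)}{\displaystyle s! \sum_{a,\ldots,k=1}^{n} p_a^r \cdots p_k^r\, G_r(n-k, N-kr, p_a, \ldots, p_k)} \qquad (a \ne b,\ \text{etc.})
\tag{8.5}
$$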


For the case when $p_1 = p_2 = \cdots = p_n = 1/n$, it is found that (8.1), (8.2),
(8.3), (8.4), and (8.5) respectively yield

$$\pi_{kr,ls} = \left(\frac{1}{n}\right)^{N} \frac{n^{(k+l)} N^{(kr+ls)}}{k!\,l!\,(r!)^k (s!)^l}\, f_{rs}(n-k-l, N-kr-ls) \tag{8.6}$$

$$f_r(n, N) = f_{rs}(n, N) + \frac{n N^{(s)}}{s!}\, f_{rs}(n-1, N-s) + \frac{n(n-1) N^{(2s)}}{2!\,(s!)^2}\, f_{rs}(n-2, N-2s) + \cdots \tag{8.7}$$

$$\sigma_{kl} = n^{(k+l)} N^{(kr+ls)} (n-k-l)^{N-kr-ls}/(r!)^k (s!)^l n^N \tag{8.8}$$

$${}_r\bar{x}_{ls} = (n-l)(N-ls)^{(r)} f_s(n-1-l, N-r-ls)/r!\, f_s(n-l, N-ls) \tag{8.9}$$

$${}_s\bar{y}_{kr} = (n-k)(N-kr)^{(s)} f_r(n-k-1, N-kr-s)/s!\, f_r(n-k, N-kr) \tag{8.10}$$

For $r = 0$, $s = 1$, the results derived in this section of course reduce to those
already derived in section 7.


9. Conclusion. It is clear that the same method of procedure may be employed
to study the simultaneous distribution of the number of events which
occur $r, s, \ldots, t$ times each. However, we will not continue the discussion
any further.

We have thus seen that the multinomial distribution serves as the background
for the study of a number of distributions which have certain practical
applications.

The theory discussed herein has been illustrated by several examples which
yielded gratifying agreement between observed and theoretical results.


Washington, D. C. 



A PROBLEM IN LEAST SQUARES 

By Jan K. Wisniewski 

§1. We are dealing with two variables, the observed values of which are
denoted $x$ and $y$ respectively. The pairs of observations are divided into $r$
groups, numbering $n_1, n_2, \ldots, n_r$ pairs. Suppose in each group we determine a
regression equation of the following shape:

$$y_i = a_i + b_i x + \cdots + m_i x^s \tag{1}$$

where $y_i$ denotes the value of the "dependent" variable obtained from the
regression equation, while $y$ without any subscript denotes its observed value.
The $r$ regression equations of type (1) are not assumed independent; on the
contrary, we postulate that

$$\sum_{1}^{r} y_i = a_0 + b_0 x + \cdots + m_0 x^s \tag{2}$$

be fulfilled identically in $x$; $a_0, b_0, \ldots, m_0$ being predetermined numbers. This
leads to the following conditions:

$$\sum_{1}^{r} a_i = a_0, \qquad \sum_{1}^{r} b_i = b_0, \qquad \ldots, \qquad \sum_{1}^{r} m_i = m_0. \tag{3}$$

The magnitude to be minimized under the theory of least squares is now

$$
z = \sum_{i=1}^{r-1} \sum\nolimits_i \left[ y - (a_i + b_i x + \cdots + m_i x^s) \right]^2 + \sum\nolimits_r \left[ y - \Big(a_0 - \sum_{1}^{r-1} a_i\Big) - \Big(b_0 - \sum_{1}^{r-1} b_i\Big) x - \cdots - \Big(m_0 - \sum_{1}^{r-1} m_i\Big) x^s \right]^2 \tag{4}
$$

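The normal equations derived from (4) are of the following shape:

$$
\begin{aligned}
&n_j a_j + n_r \sum_{1}^{r-1} a_i + b_j \sum\nolimits_j x + \Big(\sum_{1}^{r-1} b_i\Big)\Big(\sum\nolimits_r x\Big) + \cdots + m_j \sum\nolimits_j x^s + \Big(\sum_{1}^{r-1} m_i\Big)\Big(\sum\nolimits_r x^s\Big)\\
&\qquad = \sum\nolimits_j y - \sum\nolimits_r y + n_r a_0 + b_0 \sum\nolimits_r x + \cdots + m_0 \sum\nolimits_r x^s\\[6pt]
&a_j \sum\nolimits_j x + \Big(\sum_{1}^{r-1} a_i\Big)\Big(\sum\nolimits_r x\Big) + b_j \sum\nolimits_j x^2 + \Big(\sum_{1}^{r-1} b_i\Big)\Big(\sum\nolimits_r x^2\Big) + \cdots + m_j \sum\nolimits_j x^{s+1} + \Big(\sum_{1}^{r-1} m_i\Big)\Big(\sum\nolimits_r x^{s+1}\Big)\\
&\qquad = \sum\nolimits_j xy - \sum\nolimits_r xy + a_0 \sum\nolimits_r x + b_0 \sum\nolimits_r x^2 + \cdots + m_0 \sum\nolimits_r x^{s+1}\\[6pt]
&\qquad \cdots\\[6pt]
&a_j \sum\nolimits_j x^s + \Big(\sum_{1}^{r-1} a_i\Big)\Big(\sum\nolimits_r x^s\Big) + b_j \sum\nolimits_j x^{s+1} + \Big(\sum_{1}^{r-1} b_i\Big)\Big(\sum\nolimits_r x^{s+1}\Big) + \cdots + m_j \sum\nolimits_j x^{2s} + \Big(\sum_{1}^{r-1} m_i\Big)\Big(\sum\nolimits_r x^{2s}\Big)\\
&\qquad = \sum\nolimits_j x^s y - \sum\nolimits_r x^s y + a_0 \sum\nolimits_r x^s + b_0 \sum\nolimits_r x^{s+1} + \cdots + m_0 \sum\nolimits_r x^{2s}
\end{aligned}
\tag{5}
$$

$\sum_i$ meaning a summation extended over the $i$-th group. As (1) is of the
$s$-th degree, we have $(s + 1)(r - 1)$ parameters to determine and as many
equations, the problem thus being in theory solved.* As to the numerical
solution, Doolittle's method or any other may be applied. We do not enter
at present into the question of how much labor the actual solution would require.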

Examples. Allen and Bowley in their book on "Family Expenditure"
(London, 1935) assume the expenditure on some defined item $f$ to be a linear
function of the total expenditure $e$

$$f = ke + c. \tag{6}$$

Evidently $\sum k = 1$, $\sum c = 0$ (cf. pp. 10-11). Another example I give in a
paper on seasonal variation, which appeared in "Economic Studies" III
(Kraków). Actual values $y$ of a time series are assumed to be linear functions
of certain "normal" values $x$

$$y = a + bx \tag{7}$$

$a$ and $b$ changing from month to month but constant from year to year. Then
$\sum a = 0$, $\sum b = 12$.

§2. Methods of solution in special cases. The generally recognized methods
of solving normal equations become extremely laborious as the product
$(s + 1)(r - 1)$ grows large. As a matter of fact, the amount of computer's work is
approximately proportional to the cube of the number of parameters to determine.
Therefore short cuts seem to be indispensable. A most elegant one is
at our disposal in the special case$^1$ when the values of $x$ in the several groups
are identical, or, at least, the sums $n_i, \sum_i x, \sum_i x^2, \ldots, \sum_i x^{2s}$ are identical
in $i$. Instead of (1) we shall write

$$y_i = A_i + B_i X_1 + \cdots + M_i X_s, \tag{8}$$

where $X_1, X_2, \ldots, X_s$ are orthogonal polynomials, i.e. such that $\sum X_j X_k = 0$
if and only if $j \ne k$. In general, $X_k = x^k + \alpha_{k-1} x^{k-1} + \cdots + \alpha_0$, the coefficients
being rational functions of $n, \sum x, \sum x^2, \ldots, \sum x^{2k-1}$.

* The remaining $s + 1$ parameters $a_r, b_r, \ldots, m_r$ are, of course, found from (3).

1 This seems to be realized in Allen and Bowley's work.

The conditions (3) can now be replaced by a set of equivalent ones, viz.

$$\sum_{1}^{r} A_i = A_0, \qquad \sum_{1}^{r} B_i = B_0, \qquad \ldots, \qquad \sum_{1}^{r} M_i = M_0. \tag{9}$$

How the actual values of $A_0, B_0, \ldots, M_0$ are found will be shown in the next
paragraph. The solution becomes now very easy, as the normal equations
for the determination of each set of $r - 1$ parameters are independent, i.e. we
can calculate the $A$'s separately, then the $B$'s etc., the order of solution being
of no importance. Moreover the shape of the normal equations permits of
considerable simplification of solution. Suppose we have to determine the
values of the coefficients $K_i$ corresponding to $X_h$. The normal equations are
now, after certain simplifications,

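$$
\begin{aligned}
2K_1 + K_2 + K_3 + \cdots + K_{r-1} &= \frac{1}{\sum X_h^2} \Big( \sum\nolimits_1 X_h y - \sum\nolimits_r X_h y \Big) + K_0\\
K_1 + 2K_2 + K_3 + \cdots + K_{r-1} &= \frac{1}{\sum X_h^2} \Big( \sum\nolimits_2 X_h y - \sum\nolimits_r X_h y \Big) + K_0\\
&\;\;\vdots\\
K_1 + K_2 + K_3 + \cdots + 2K_{r-1} &= \frac{1}{\sum X_h^2} \Big( \sum\nolimits_{r-1} X_h y - \sum\nolimits_r X_h y \Big) + K_0.
\end{aligned}
\tag{10}
$$

Adding these equations, dividing the sum by $r$ and subtracting the quotient
from the $j$-th equation, we get

$$K_j = \frac{\sum_j X_h y}{\sum X_h^2} - \frac{1}{r} \left( \frac{\sum_{i=1}^{r} \sum_i X_h y}{\sum X_h^2} - K_0 \right). \tag{11}$$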


The first member of the right hand side of (11) should be regarded as the
principal term: this is actually the value we would obtain for $K_j$, were this
coefficient independent from the other $K$'s. The second member is a correction
term, the necessary amount of correction being distributed equally among the
several $K$'s. The simple solution given by (11) is only possible if the sum
$\sum X_h^2$ is the same for each group. From the definition of $X_h$ we see that it
is equivalent to saying that $\sum_i x, \sum_i x^2, \ldots, \sum_i x^{2h}$ be identical in $i$. As
$h$ increases to $s$, we come to the condition given at the beginning of this
paragraph.

§3. If this condition is not fulfilled, we can, indeed, replace the power series
in $x$ by orthogonal polynomials $X_{k \cdot i}$, the second subscript being appended
in order to show that the values of the $X$ polynomials are no more identical
for the several groups; these polynomials are now orthogonalised separately
within each group. But we are no more able to predetermine the values of
$A_0, B_0, \ldots, M_0$, as they depend on each other; this will be made clear a little
later. Therefore we have to resort to an approximation: the values of the
parameters will not be found from simultaneous equations, but successively,
step by step, beginning with those corresponding to the highest degree of the
independent variable.

The values of $a_0, b_0, \ldots, m_0$ are given. It is evident that $m_0 = M_0$. The
$j$-th normal equation is now:

$$M_j \sum\nolimits_j X_{s \cdot j}^2 - M_0 \sum\nolimits_r X_{s \cdot r}^2 + \Big( \sum_{1}^{r-1} M_i \Big) \Big( \sum\nolimits_r X_{s \cdot r}^2 \Big) = \sum\nolimits_j X_{s \cdot j}\, y - \sum\nolimits_r X_{s \cdot r}\, y. \tag{12}$$

We see at once that

$$M_i = \frac{M_1 \sum_1 X_{s \cdot 1}^2 + \sum_i X_{s \cdot i}\, y - \sum_1 X_{s \cdot 1}\, y}{\sum_i X_{s \cdot i}^2}. \tag{13}$$

Inserting this into (12) we get

$$M_j = \frac{\sum_j X_{s \cdot j}\, y}{\sum_j X_{s \cdot j}^2} - \frac{1}{\sum_j X_{s \cdot j}^2} \cdot \frac{\displaystyle \sum_{i=1}^{r} \frac{\sum_i X_{s \cdot i}\, y}{\sum_i X_{s \cdot i}^2} - M_0}{\displaystyle \sum_{i=1}^{r} \frac{1}{\sum_i X_{s \cdot i}^2}}. \tag{14}$$

The second member of the right hand side of (14) is again a correction term,
the necessary amount of correction being distributed in inverse proportion to
$\sum_j X_{s \cdot j}^2$. Now we determine the value of $L_0$, this coefficient corresponding
to $s - 1$, the second highest degree of $x$, and calculate the several $L$'s from
equations strictly analogous to (14), thus accomplishing the second step of our
work, and so on, down to the $A$'s. $L_0$ is found from the following equation:

$$L_0 = l_0 - \sum_{1}^{r} \left[ \alpha_{s-1}(i) \cdot M_i \right]. \tag{15}$$

To $\alpha_{s-1}$ is now appended a bracketed $i$, this to stress its variation from group
to group. We see from (15) that before the several $M$'s are calculated we are
not in a position to determine $L_0$. On the other hand, if $\alpha_{s-1}$ is the same for
all groups, the second member of the right hand side of (15) simply reduces
to $\alpha_{s-1} \cdot m_0$ and $L_0$ can be determined in advance, i.e. before calculating the
$M$'s. This is the case treated first (in §2). In any case, if no definite correlation
is to be expected between $\alpha_{s-1}(i)$ and $M_i$, the approximative method
developed here should give very nearly correct results. The writer applied
this method of solution to the simple problem of seasonal variation mentioned
in §1 and found the results very satisfactory.
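The correction rule for a single set of coefficients can be sketched numerically. The example below is an illustration under stated assumptions: it reads (14) as the least-squares solution subject to the sum of the coefficients being fixed, so that each group's unconstrained estimate is reduced by a share of the excess inversely proportional to that group's sum of squares. The function name and the input figures are hypothetical.

```python
from fractions import Fraction


def constrained_coefficients(T, S, total):
    """Coefficients M_i with sum(M_i) == total.

    T[i] is the sum of X*y in group i, S[i] the sum of X^2 in group i.
    """
    free = [Fraction(t) / s for t, s in zip(T, S)]        # principal terms
    excess = sum(free) - total                            # amount to be corrected
    inverse_sum = sum(Fraction(1) / s for s in S)
    return [m - (Fraction(1) / s) * excess / inverse_sum for m, s in zip(free, S)]


# Unequal sums of squares: correction in inverse proportion to S[i], as in (14).
M = constrained_coefficients(T=[6, 4, 9], S=[2, 4, 3], total=5)

# Equal sums of squares: the correction is shared equally, as in (11).
K = constrained_coefficients(T=[6, 4, 8], S=[2, 2, 2], total=6)
print(M, K)
```

With equal sums of squares every coefficient receives the same correction, which is the special case of §2; with unequal ones the groups with the smaller sums of squares absorb more of it.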



A SIGNIFICANCE TEST FOR COMPONENT ANALYSIS 

By Paul G. Hoel 

1. Introduction 

During the last few years several papers and books have been written on 
various aspects of what has been termed component or factor analysis. This 
analysis has arisen from the psychological problem of describing the results on a 
series of tests in terms of a few distinct abilities or components. In much of 
such work it is claimed that there does not exist more than a certain number 
of components, the material discarded in order to substantiate such a claim 
being considered as due to random errors of sampling or errors of measurement. 
However, mere inspection of results or the calculation of standard errors of 
residual correlations is hardly sufficient to justify such conclusions, and therefore
a significance test of some kind is necessary. Hotelling$^1$ considered such
a test but based it upon an uncertain analogy with the analysis of variance 
and upon the legitimacy of using standard errors. The purpose of this paper 
is to derive a test which is more general in scope and in which all assumptions 
are explicitly stated. 

If each test score is thought of as being made up of two parts, a true score 
and an error element, the assumption that there exist fewer components than
the number of tests implies that the scatter diagram of the true scores will lie
in a space of correspondingly smaller dimensionality. Consequently, an ideal
test for the number of components would be one which would test the rank
of the true moment matrix. In the case of normally distributed variables,
this line of approach leads one to the sampling distribution of the generalized
variance. Unfortunately, this distribution appears in unintegrated form; however,
by considering its moments it is possible to find a good approximation
to this exact distribution for samples which are not too small. 

The paper proceeds by first finding two approximation distributions for the 
generalized variance, one for samples which are not too small and one for large 
samples. It then considers the type of population from which it will be assumed 
the sample was drawn, and finally applies the test to two numerical examples 
from recent literature along such lines. 

2. Approximation Distributions 

Suppose that $N$ individuals have been drawn at random from an $n$ variate
normal population whose distribution is expressed by

$$p(x_1, x_2, \ldots, x_n) = K e^{-\frac{1}{2} \sum_{i,j=1}^{n} A_{ij} x_i x_j} \tag{1}$$

1 Harold Hotelling, Analysis of a Complex of Statistical Variables into Principal Components,
The Journal of Educational Psychology, September and October, 1933, pp. 21-26.

where $x_i = X_i - m_i$, $A_{ij} = \dfrac{\Delta_{ij}}{\sigma_i \sigma_j \Delta}$, $\Delta$ is the determinant $|\rho_{ij}|$ and $\Delta_{ij}$ is the
cofactor of $\rho_{ij}$ in $\Delta$, and $K = |A_{ij}|^{1/2}/(2\pi)^{n/2}$. If the observed values of the
variables of the $\alpha$th individual are denoted by $X_{i\alpha}$ $(i = 1, 2, \ldots, n)$, then the
generalized sample variance is defined as $z = |a_{ij}|$, where $a_{ij} = \dfrac{1}{N} \sum_{\alpha=1}^{N} (X_{i\alpha} - \bar{X}_i)(X_{j\alpha} - \bar{X}_j)$. Wilks$^2$ has shown that in sampling from the population (1),
the $k$th moment of the sampling distribution of $z$ is given by

$$M_k = A^{-k}\, \frac{\Gamma\!\left(\frac{N+2k-1}{2}\right) \Gamma\!\left(\frac{N+2k-2}{2}\right) \cdots \Gamma\!\left(\frac{N+2k-n}{2}\right)}{\Gamma\!\left(\frac{N-1}{2}\right) \Gamma\!\left(\frac{N-2}{2}\right) \cdots \Gamma\!\left(\frac{N-n}{2}\right)}$$

where $A = \dfrac{N^n}{2^n}\, |A_{ij}|$. An inspection of the integrated form of the distribution
of $z$ in the case of $n = 1$ and $n = 2$ suggests that there likely exists a function
of similar form for higher values of $n$ whose $k$th moment can be made to differ
from $M_k$ only in higher powers of terms which contain $1/N$ as a factor. An
investigation along such lines leads to the function

$$g(z) = C z^m e^{-n \sqrt[n]{az}} \tag{2}$$

where $C = \dfrac{a^{\frac{N-n}{2}}\, n^{\frac{n(N-n)}{2}}}{\Gamma\!\left(\frac{n(N-n)}{2}\right)}$, $m = \dfrac{N-n-2}{2}$, $a = Aq$ and $q = 1 - \dfrac{(n-1)(n-2)}{2N}$.

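It will be shown that the $k$th moment $M_k'$ of $g(z)$ differs from $M_k$ only in terms
of magnitude less than the second and higher powers of $k^2 n/N$ or $kn^2/N$.

Multiplying $g(z)$ by $z^k$ and integrating over the entire range of $z$ will yield
$M_k'$, which turns out to be

$$M_k' = a^{-k} n^{-nk}\, \frac{\Gamma\!\left(\frac{n(N-n+2k)}{2}\right)}{\Gamma\!\left(\frac{n(N-n)}{2}\right)}.$$

Upon reducing the upper gamma function and performing successive steps of
simple algebra

$$
\begin{aligned}
M_k' &= a^{-k} n^{-nk} \left[ n\, \frac{N-n+2k}{2} - 1 \right] \left[ n\, \frac{N-n+2k}{2} - 2 \right] \cdots \left[ n\, \frac{N-n}{2} \right]\\[4pt]
&= a^{-k} n^{-nk} \left( \frac{nN}{2} \right)^{nk} \left( 1 + \frac{2k - n - 2/n}{N} \right) \left( 1 + \frac{2k - n - 4/n}{N} \right) \cdots \left( 1 + \frac{2k - n - 2kn/n}{N} \right).
\end{aligned}
$$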


2 S. S. Wilks, Certain Generalizations in the Analysis of Variance, Biometrika, Vol.
XXIV, 1932, p. 477.

The terms in parentheses may be treated as the factored form of a polynomial
of the $nk$th degree in unity. Thus the quantities $\dfrac{2k - n - 2/n}{N}$, etc., may be
treated as the zeros with signs changed of the corresponding polynomial in
$x$ (say). As a result, the successive terms after the first in the non-factored
form of this polynomial in unity are the sums of the products of these quantities
taken one at a time, two at a time, etc. Upon performing this multiplication
and letting $\phi = N^n/2^n A$, $M_k'$ assumes the form

$$M_k' = \phi^k q^{-k} \left[ 1 - \frac{k(n^2 - nk + 1)}{N} + \cdots \right]$$

where the neglected terms are in magnitude less than the second and higher
powers of $k^2 n/N$ or $kn^2/N$. If $M_k$ is handled in exactly the same manner, it
will be found that

$$
\begin{aligned}
M_k &= A^{-k} \left( \frac{N+2k-1}{2} - 1 \right) \cdots \left( \frac{N-1}{2} \right) \cdots \left( \frac{N+2k-n}{2} - 1 \right) \cdots \left( \frac{N-n}{2} \right)\\[4pt]
&= \phi^k \left( 1 + \frac{2k-3}{N} \right) \cdots \left( 1 - \frac{1}{N} \right) \cdots \left( 1 - \frac{n}{N} \right)\\[4pt]
&= \phi^k \left[ 1 - \frac{nk(n - 2k + 3)}{2N} + \cdots \right]
\end{aligned}
$$

where the neglected terms arc of the same order of magnitude as those neglected 
in the approximation to Ml . Before a comparison of Mk and Ml is possible, 
the factor q~ k of Ml must be expanded and multiplied into the quantity in 
brackets. This operation yields the result 


Ml = <f> k 1 - 


nk(n — 2k + 3) 
~2N 



Thus $M_k$ and $M'_k$ agree to within neglected terms. As a matter of fact, if
the values of the neglected terms are considered more carefully, it will be found
that the actual difference between $M_k$ and $M'_k$ is considerably less than the
given upper bound for the magnitude of neglected terms would indicate. For
example, when $n = 5$ the first term in the difference is $6k(k - .9)N^{-2}$, while
$625k^2N^{-2}$ or $25k^4N^{-2}$ is the upper bound for this term when only general results
are used. The general formula for the first term in this difference has been
obtained, but since the remaining terms have not been investigated and since
the type of problems to which the distribution $g(z)$ is to be applied does not
justify this refinement, it will not be considered here. Consequently, if one
considers this distribution function as sufficiently determined by its low order
moments and if one applies $g(z)$ only to problems in which $N$ is fairly large
compared with $n$, then the function $g(z)$ will give a good approximation to the
exact sampling distribution of $z$. Obviously, $g(z)$ is identical with the exact
distribution for the known cases of $n = 1$ and $n = 2$. It is not possible under
the above expansions to vary the constants in the form of $g(z)$ in such a manner
as to obtain an approximation whose $k$th moment will agree with $M_k$ to within
still higher powers of comparable terms.
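The agreement of the two sets of moments can be checked numerically. The following is a minimal sketch and not part of the original derivation: the function names are illustrative, and it assumes the moment forms $M_k = A^{-k}\prod_{i=1}^{n}\Gamma(\frac{N-i}{2}+k)/\Gamma(\frac{N-i}{2})$ and $M'_k = a^{-k}n^{-nk}\,\Gamma(\frac{n(N-n)}{2}+nk)/\Gamma(\frac{n(N-n)}{2})$, with $a = Aq$, $q = 1-(n-1)(n-2)/2N$ and $\phi = N^n/(2^nA)$. Both moments are expressed in units of $\phi^k$.

```python
import math

def log_exact_moment(k, n, N):
    """log(M_k / phi**k) for the exact sampling distribution of z."""
    total = n * k * math.log(2.0 / N)
    for i in range(1, n + 1):
        total += math.lgamma((N - i) / 2 + k) - math.lgamma((N - i) / 2)
    return total

def log_approx_moment(k, n, N):
    """log(M'_k / phi**k) for the approximating function g(z)."""
    q = 1 - (n - 1) * (n - 2) / (2 * N)
    shape = n * (N - n) / 2
    return (-k * math.log(q) + n * k * math.log(2.0 / (n * N))
            + math.lgamma(shape + n * k) - math.lgamma(shape))

def moment_gap(k, n, N):
    """(M'_k - M_k) / phi**k."""
    return math.exp(log_approx_moment(k, n, N)) - math.exp(log_exact_moment(k, n, N))
```

Under these assumptions the gap vanishes (to rounding) for $n = 1$ and $n = 2$, and for $n = 5$, $k = 1$ the product of the gap and $N^2$ lies close to $6k(k - .9) = .6$ when $N$ is large.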

In order to test whether or not a sample value $z = Z$ can be reasonably
assumed to have been obtained in random sampling from a population of type
(1) with fixed $A$, it is necessary to calculate the probability $P$ of obtaining in
repeated samples a value of $z$ greater than $Z$. Thus it is necessary to evaluate

\[ 1 - \int_0^Z g(z)\,dz. \]

Upon making the substitution $x = n\sqrt[n]{az}$, and letting $p = \frac{n(N-n)}{2} - 1$ and

\[ u = n\sqrt[n]{aZ}\left(\frac{n(N-n)}{2}\right)^{-\frac12} = nN\sqrt[n]{\frac{qZ}{\phi}}\;[2n(N-n)]^{-\frac12}, \]

this integral can be reduced to the standard form of the incomplete gamma function.
Hence $P$ assumes the form

\[ P = 1 - I(u, p) \tag{3} \]

where

\[ I(u, p) = \frac{1}{\Gamma(p+1)}\int_0^{u\sqrt{p+1}} x^{p} e^{-x}\,dx. \]

In many applications of this distribution it will be found that the values of
$u$ and $p$ lie beyond the tabled$^3$ values of these constants. Consequently, it
will often be sufficient to use the normal distribution to which the gamma 
distribution tends as N becomes large. This normal distribution will be 
considered next. 
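When the required values fall outside the tables, the probability can also be computed directly. The sketch below is an illustration only, not part of the original text. It assumes that the tail probability equals the regularized upper incomplete gamma function with shape $n(N-n)/2$ evaluated at $x = \frac{nN}{2}\sqrt[n]{qZ/\phi}$, where $q = 1 - (n-1)(n-2)/2N$. The series and continued-fraction evaluation is a standard numerical method, and the function names are hypothetical.

```python
import math

def upper_gamma_q(a, x, eps=1e-14, max_iter=100000):
    """Regularized upper incomplete gamma function Q(a, x) = 1 - P(a, x), a > 0."""
    if x <= 0.0:
        return 1.0
    log_pref = a * math.log(x) - x - math.lgamma(a)
    if x < a + 1.0:
        # Power series for the lower function P(a, x).
        term = 1.0 / a
        total = term
        denom = a
        for _ in range(max_iter):
            denom += 1.0
            term *= x / denom
            total += term
            if abs(term) < abs(total) * eps:
                break
        return 1.0 - total * math.exp(log_pref)
    # Continued fraction for Q(a, x), evaluated by the modified Lentz method.
    tiny = 1e-300
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, max_iter):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h * math.exp(log_pref)

def tail_probability(ratio, n, N):
    """Approximate Pr(z > Z) from g(z), where ratio = Z / phi."""
    q = 1.0 - (n - 1) * (n - 2) / (2.0 * N)
    shape = n * (N - n) / 2.0            # equals p + 1
    x = 0.5 * n * N * (q * ratio) ** (1.0 / n)
    return upper_gamma_q(shape, x)
```

For $n = 1$ the function reduces to the usual chi-square tail for the sample variance, which gives a simple check on the sketch.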

Rather than obtain a normal approximation to g(z) or the gamma function 
to which g(z) reduces after the above transformation, it is more illuminating 
to find the basic descriptive parameters of the exact distribution of z and from 
them obtain a normal approximation. Such a procedure will show how rapidly 
the distribution of z approaches normality with increasing N. By using the 
recurrence formula connecting $M_{k+1}$ and $M_k$, which can be found directly from
the ratio of these two moments, and expressing the necessary moments in 

3 K. Pearson, Tables of the Incomplete Gamma Function, Biometric Laboratory (1922),
Univ. of London.
terms of $M_1$, it can be shown that these basic descriptive parameters are expressible
in expanded form as follows:

\[ M_1 = \phi\left[1 - \frac{n(n+1)}{2N} + \frac{n(n+1)(n-1)(3n+2)}{24N^2} + \cdots\right] \]

\[ \sigma_z^2 = \phi^2\left[\frac{2n}{N} - \frac{n(2n^2-n+1)}{N^2} + \cdots\right] \]

\[ \alpha_3^2 = \frac{2(3n-1)^2}{nN}\left[1 + \frac{(n+1)(5n-3)}{2(3n-1)N} + \cdots\right] \]

\[ \alpha_4 = 3 + \frac{4(2n-1)(4n-1)}{nN} + \cdots \]

These values suggest that

\[ w = \sqrt{\frac{N}{2n}}\left[\frac{z}{\phi} - 1\right] \tag{4} \]

will likely be distributed approximately normally with zero mean and unit
variance. As a matter of fact, by using the second limit theorem of probability,$^4$
it can be shown that the distribution of $w$ approaches normality as $N$ increases
indefinitely. Hence, for samples in which $N$ is large compared with $n^2$, it
will be sufficient to compare the value of $w$ arising from a sample $z = Z$ with
its variance of unity if a test of significance is desired. A better general
approximation could have been obtained by centering the curve at
$\phi\left[1 - \frac{n(n+1)}{2N}\right]$
rather than at $\phi$; however, since there is positive skewness and the true mean
lies between these two values, there might arise some exaggeration in a
significance test in doing so because the accuracy of such a test depends upon the
accuracy of the approximation in the right hand tail of the curve.

Inspection of (3) and (4) shows that the only population parameter upon
which these approximation distributions depend is $\phi$. There are no
assumptions necessary about the population means, or variances, or covariances,
except in so far as they may be related when the value of $\phi$ is postulated. This
means that either (3) or (4) enables one to test whether or not it is reasonable
to assume that the sample variance $z = Z$ arose in random sampling from some
normal population with $\phi$ equal to the postulated value.



3. Population Assumptions 

Consider the set of variables $u_1, u_2, \ldots, u_n$ distributed according to the
normal law

\[ P(u_1, u_2, \ldots, u_n) = K_1 e^{-\sum_{1}^{n} b_{ij}u_iu_j} \tag{5} \]

4 See, for example, Fréchet and Shohat, A Proof of the Generalized Second Limit
Theorem in the Theory of Probability, Transactions of the American Mathematical
Society, Vol. 33, (1931), p. 633.
and the set of variables $v_1, v_2, \ldots, v_n$ distributed according to the normal law

\[ p(v_1, v_2, \ldots, v_n) = K_2 e^{-\sum_{1}^{n} c_i v_i^2} \tag{6} \]

where the $v$'s are uncorrelated with the $u$'s and with each other. The joint
distribution of the $u$'s and $v$'s is expressed by

\[ P(u_1, \ldots, v_n) = K_3 e^{-\sum_{1}^{n} b_{ij}u_iu_j - \sum_{1}^{n} c_i v_i^2}. \tag{7} \]

Upon writing down the determinant of the coefficients of these $2n$ variables,
it will become evident that any one of its principal minors of any order can be
expressed as the product of a principal minor of $|b_{ij}|$ with a principal minor of
$|c_i|$. Since the distributions (5) and (6) are normal, the determinants $|b_{ij}|$
and $|c_i|$ are positive definite; consequently the determinant of the coefficients
in (7) must also be positive definite.

Now consider the orthogonal transformation

\[ y_i = \frac{u_i + v_i}{\sqrt2}, \qquad i = 1, 2, \ldots, n \]

\[ y_i = \frac{u_{i-n} - v_{i-n}}{\sqrt2}, \qquad i = n+1, \ldots, 2n. \]

Since the determinant of the coefficients in (7) is invariant under an orthogonal
transformation, the resulting distribution of the $y$'s may be expressed by

\[ P(y_1, y_2, \ldots, y_{2n}) = K_4 e^{-\sum_{1}^{2n} d_{ij}y_iy_j} \tag{8} \]

where $|d_{ij}|$ is positive definite.

In order to obtain the distribution of the variables $y_1, y_2, \ldots, y_n$, it is
necessary to integrate (8) with respect to the variables $y_{n+1}, \ldots, y_{2n}$ over
their range of values. If this integration is performed after the quadratic form
in the exponent of (8) has been expressed as a sum of squares$^5$ with coefficients
which are the ratios of principal minors of $|d_{ij}|$, it will be clear that the
integration leaves a quadratic form in the exponent which is also positive definite.
Hence after the transformation $x_i = \sqrt2\,y_i$ $(i = 1, 2, \ldots, n)$ the distribution
function of the variables $x_i = u_i + v_i$ $(i = 1, 2, \ldots, n)$ must be normal and
may be expressed by (1). Thus it has been shown that if the true parts $u_i$
of the variables $x_i$ are normally distributed without error and if the error parts
$v_i$ are normally distributed but are uncorrelated with the $u_i$ and with each
other, then the variables $x_i$ possess a normal distribution. The advantage of


5 See, for example, Risser and Traynard, Les Principes de la Statistique Mathématique,
1933, p. 223.
this formulation will become evident when the parameter $\phi$ is expressed in
terms of the parameters of (5) and (6).

Since the $v$'s are uncorrelated with the $u$'s and with each other, the variance
$\sigma_i^2$ of $x_i$ is the sum of the variances of $u_i$ and $v_i$, while the correlation $\rho_{ij}$
between $x_i$ and $x_j$ may be expressed in terms of the correlation $\rho'_{ij}$ between $u_i$
and $u_j$ and the variances $\mu_i^2, \mu_j^2, \nu_i^2, \nu_j^2$ of $u_i, u_j, v_i, v_j$ respectively. These
relationships are

\[ \sigma_i^2 = \mu_i^2 + \nu_i^2 \quad\text{and}\quad \rho_{ij} = \frac{\rho'_{ij}}{\sqrt{(1 + \nu_i^2/\mu_i^2)(1 + \nu_j^2/\mu_j^2)}} \quad (i \ne j). \tag{9} \]

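For simplicity of notation let $\lambda_i = \nu_i^2/\mu_i^2$. Now it is well known$^6$ that $\phi$ can
be expressed in the form

\[ \phi = \sigma_1^2\sigma_2^2\cdots\sigma_n^2\,|\rho_{ij}|. \]

If the values from (9) are inserted in $|\rho_{ij}|$ and if the resulting denominators
of elements are factored out, $\phi$ will assume the form

\[ \phi = \frac{\sigma_1^2\sigma_2^2\cdots\sigma_n^2\,B}{(1+\lambda_1)\cdots(1+\lambda_n)} \]

where

\[ B = \begin{vmatrix} 1+\lambda_1 & \rho'_{12} & \cdots & \rho'_{1n} \\ \rho'_{12} & 1+\lambda_2 & \cdots & \rho'_{2n} \\ \vdots & \vdots & & \vdots \\ \rho'_{1n} & \rho'_{2n} & \cdots & 1+\lambda_n \end{vmatrix}. \]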

Following the methods of confluence analysis,$^7$ $B$ can be expressed as follows:

\[ B = R + \sum_{\alpha=1}^{n}\lambda_\alpha R_{)\alpha(} + \sum_{\alpha<\beta}^{n}\lambda_\alpha\lambda_\beta R_{)\alpha\beta(} + \cdots + \lambda_1\lambda_2\cdots\lambda_n \]

where $R = |\rho'_{ij}|$, $R_{)\alpha(}$ is the principal minor of $R$ obtained by deleting row
and column $\alpha$, etc. $R$ is the true correlation determinant whose rank it is the
object of this paper to test. If $R$ is assumed to be of rank $n - t$, then all
principal minors containing more than $n - t$ rows vanish and $B$ reduces to

\[ B = \sum_{\alpha_1<\cdots<\alpha_t}^{n}\lambda_{\alpha_1}\lambda_{\alpha_2}\cdots\lambda_{\alpha_t}R_{)\alpha_1\alpha_2\cdots\alpha_t(} + \cdots + \lambda_1\lambda_2\cdots\lambda_n. \]
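The expansion of a determinant whose diagonal is augmented by the $\lambda$'s into principal minors of $R$ can be verified on a small case. The following is a sketch under stated assumptions, not the author's computation: the function names and the sample matrices are hypothetical, and the determinant routine uses plain cofactor expansion, which is adequate only for small $n$.

```python
from itertools import combinations

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if not m:
        return 1.0
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for col, entry in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * entry * det(minor)
    return total

def b_direct(R, lam):
    """Determinant of R with lam[i] added to the i-th diagonal element."""
    n = len(R)
    return det([[R[i][j] + (lam[i] if i == j else 0.0) for j in range(n)]
                for i in range(n)])

def b_expanded(R, lam):
    """Sum over deleted index sets of (product of lambdas) * (principal minor of R)."""
    n = len(R)
    total = 0.0
    for size in range(n + 1):
        for deleted in combinations(range(n), size):
            keep = [i for i in range(n) if i not in deleted]
            weight = 1.0
            for a in deleted:
                weight *= lam[a]
            total += weight * det([[R[i][j] for j in keep] for i in keep])
    return total
```

For a matrix $R$ of rank one (all correlations equal to unity) and $n = 3$, every principal minor of more than one row vanishes, so only the terms containing at least $t = 2$ of the $\lambda$'s survive.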

The tests (3) and (4) were designed to test hypothetical values of $\phi$ by means
of the sample $Z$. Evidently the value of $\phi$ can be postulated by assigning
hypothetical values to the $\lambda$'s, the $\sigma$'s, and the principal minors of $R$.
Assigning values to the $\lambda$'s does not curtail the degrees of freedom in these
tests because they were derived on the basis of (1) which depends only on the
$m$'s, $\sigma$'s, and $\rho$'s. The $\lambda$'s do restrict the range of the $\rho$'s, but not their degrees
of freedom.

6 S. S. Wilks, loc. cit., p. 477.

7 Ragnar Frisch, Statistical Confluence Analysis by Means of Complete Regression
Systems, Oslo, 1934.

An inspection of the expression for $\phi$ shows that $\phi$ can be made to assume
any desired value regardless of the rank of $R$ by merely assigning the $\sigma$'s
properly. It is therefore necessary to make some assumption regarding the
$\sigma$'s if the test is to serve the purpose for which it is intended. Here it will be
sufficient to assume that the product of the population variances may be
replaced by the product of the sample variances. This assumption will ordinarily
be approximately fulfilled for the size samples for which it is legitimate to
employ (3) or (4); consequently this assumption does not restrict the range of
application of the test.

To postulate values of the principal minors of $R$ beyond postulating the rank
of $R$ would introduce hypotheses and restrictions which are irrelevant to the
fundamental purpose of the test. This difficulty will be avoided by replacing
all non-vanishing minors of $R$ by their upper bounds of unity. Since this
will overestimate the value of $B$, and hence of $\phi$, the usual significance level of
.05 may be considered as decisive. Let the value of $B$ when unity is inserted
for all non-vanishing principal minors be denoted by $D$. Then

\[ D = \sum_{\alpha_1<\cdots<\alpha_t}^{n}\lambda_{\alpha_1}\lambda_{\alpha_2}\cdots\lambda_{\alpha_t} + \cdots + \lambda_1\lambda_2\cdots\lambda_n. \tag{10} \]

Since

\[ \prod_{1}^{n}(1+\lambda_i) = 1 + \sum_{\alpha=1}^{n}\lambda_\alpha + \sum_{\alpha_1<\alpha_2}^{n}\lambda_{\alpha_1}\lambda_{\alpha_2} + \cdots + \lambda_1\lambda_2\cdots\lambda_n \]

it will often be convenient to write $D$ in the form

\[ D = \prod_{1}^{n}(1+\lambda_i) - \left\{1 + \sum_{\alpha=1}^{n}\lambda_\alpha + \cdots + \sum_{\alpha_1<\cdots<\alpha_{t-1}}^{n}\lambda_{\alpha_1}\cdots\lambda_{\alpha_{t-1}}\right\}. \tag{11} \]

As a consequence of all the above assumptions,

\[ \frac{Z}{\phi} = \frac{|a_{ij}|}{\phi} = \frac{(1+\lambda_1)\cdots(1+\lambda_n)\,|r_{ij}|}{B} \ge \frac{(1+\lambda_1)\cdots(1+\lambda_n)\,|r_{ij}|}{D} \tag{12} \]

where $|r_{ij}|$ is the sample correlation determinant.

All the essential material for testing the rank of the true correlation matrix 
is contained in (3), (4), (11), and (12). In summary, the hypothesis to be tested 
and the procedure to follow in performing the test are as follows. 

The population of $n$ variables from which the sample is supposed drawn is
assumed to be such that (a) the true parts of the variables are normally
distributed, (b) the error parts are normally distributed but are uncorrelated
with the true parts and with each other, (c) the product of the variances may
be replaced by the product of the sample variances, (d) the values of the $\lambda$'s
are postulated as judged by the accuracy in measurement of the variables, and
(e) the rank of the true correlation matrix is $n - t$.

Given the value $|r_{ij}|$ of the sample correlation determinant, a lower bound
for the value of $Z/\phi$ is calculated from (11) and (12). This lower bound is
inserted in either (3) or (4), depending on the size of the sample. If (3) is
used and if $P \le .05$, or if (4) is used and $w \ge 2$, one may conclude, as judged
by the sample variance, that it is very unlikely that the sample was drawn in
random sampling from the population specified above. If one has reason to
believe that the variables are sensibly normal as indicated above and that the
postulated values of the $\lambda$'s are quite accurate, then the test shows quite
definitely that the postulated rank of the true correlation matrix is unsubstantiated
by the sample, and therefore a higher rank should be tested until a
non-significant value is obtained. Because a lower bound rather than the value of $Z/\phi$
is used, the test can be used on minimum ranks only, and hence a value of
$Z < \phi$ will not yield a test of significance. However, the test does handle the
problem for which it was designed and which is of fundamental interest, and
that is to see whether or not one is justified in assuming that a sample
represents only a certain minimum number of components.

4. Applications 

(a) Hotelling$^8$ has used an example taken from other sources to illustrate
his test on components. In order to compare results, this same example will
be treated here under the assumptions outlined above. In this example the
reliability coefficients are given. From the definition of a reliability coefficient
$r_i$, it follows at once that $r_i = \frac{1}{1+\lambda_i}$. The population values of the $\lambda$'s will
be set equal to the values obtained from these sample reliability coefficients.
The data for this problem are

$|r_{ij}| = .235$, $N = 140$, $n = 4$, $\lambda_1 = .087$, $\lambda_2 = .119$, $\lambda_3 = .101$, $\lambda_4 = .773$.

Assume that the true correlation matrix in the population is of rank two, that
is, that two components are sufficient to describe the results on these tests.
Since $N$ is large compared with $n^2$, it will be sufficient to use (4). The values
of (11), (12), and (4) are found to be

\[ D = \prod_{1}^{4}(1+\lambda_i) - \left\{1 + \sum_{1}^{4}\lambda_\alpha\right\} = .294 \]

\[ \frac{Z}{\phi} \ge \frac{\prod(1+\lambda_i)\,|r_{ij}|}{D} = 1.90 \]

\[ w \ge \sqrt{\frac{140}{8}}\,[1.90 - 1] = 3.76 \]


8 Loc. cit., p. 16.

Since the standard deviation of $w$ is unity, this value demonstrates clearly
that the hypothesis of only two components is untenable as judged by the
sample correlation determinant. If one assumes three components, the test
will be found to yield a non-significant value. Hence it may be concluded that
under the hypotheses on which the test is based, the sample does not justify
the assumption of less than three components. Hotelling's test indicated the
necessity for two components but was uncertain about the third, the decision
resting upon a variate value of 1.31 as against a standard deviation of unity.
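The arithmetic of this example can be retraced as follows. This is a minimal sketch, not the author's procedure as printed: the function names are hypothetical, and it assumes that $D$ is the sum of the elementary symmetric functions of the $\lambda$'s of order $t = n - \text{rank}$ and higher, that the lower bound for $Z/\phi$ is $\prod(1+\lambda_i)\,|r_{ij}|/D$, and that $w = \sqrt{N/2n}\,(Z/\phi - 1)$.

```python
import math

def elementary_symmetric(lams):
    """Return [e_0, e_1, ..., e_n] for the values in lams."""
    e = [1.0] + [0.0] * len(lams)
    for x in lams:
        # Update from the top down so each value is used once per term.
        for j in range(len(lams), 0, -1):
            e[j] += x * e[j - 1]
    return e

def rank_test(r_det, lams, rank, N):
    """Return (D, lower bound for Z/phi, w) for a postulated rank."""
    n = len(lams)
    t = n - rank
    e = elementary_symmetric(lams)
    D = sum(e[t:])                 # products of t or more lambdas
    prod = sum(e)                  # equals the product of (1 + lambda_i)
    bound = prod * r_det / D
    w = math.sqrt(N / (2.0 * n)) * (bound - 1.0)
    return D, bound, w

lams = [0.087, 0.119, 0.101, 0.773]
D2, bound2, w2 = rank_test(0.235, lams, rank=2, N=140)
D3, bound3, w3 = rank_test(0.235, lams, rank=3, N=140)
```

Carrying full precision through each step gives a value of $w$ slightly below the printed 3.76, which was obtained after rounding $Z/\phi$ to 1.90. For rank three the lower bound falls below unity, so no significant value arises.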

(b) Thurstone, in his "Vectors of Mind," considers an example taken from a
series of fifteen psychological tests. After applying his centroid method to the
data, he inspects his results and concludes that four components are sufficient
to account for everything except random errors. It is impossible to test his
conclusions explicitly as above because the size of the sample is not given and
the reliability coefficients are not known. Nevertheless, if it is legitimate to
assume that the sample is sufficiently large to justify the use of this test,
interesting conclusions can be obtained on the assumption that only four
components are needed.

Suppose that $\lambda_i = \frac12$, which implies that the variance of error is half as large
as the true sampling variance for each variable. Here (10) is more convenient
than (11) for computing the value of $D$. The values of (10) and (12) are
found to be

\[ D = {}_{15}C_3\left(\tfrac12\right)^{12} + {}_{15}C_2\left(\tfrac12\right)^{13} + 15\left(\tfrac12\right)^{14} + \left(\tfrac12\right)^{15} = .125 \]

\[ \frac{Z}{\phi} \ge \frac{|r_{ij}|}{.0003}. \]

Evidently, the value of $|r_{ij}|$ must lie in the neighborhood of .0003 if the test
is not to yield a significant result which contradicts the hypothesis. However,
the correlations in $|r_{ij}|$ are given to only three decimal places, and therefore
a legitimate value in the neighborhood of .0003 can not be realized. It is to be
noted that the postulated values of the $\lambda$'s are equivalent to postulating that
all reliability coefficients are equal to $\frac23$, a value which should be considered as
unusually low. It would seem reasonable to avoid using material in which the
variance of error is larger than one-half the variance of random sampling, unless
the variance of random sampling is exceedingly small.



CONTRIBUTIONS TO THE THEORY OF COMPARATIVE STATISTICAL 
ANALYSIS. I. FUNDAMENTAL THEOREMS OF 
COMPARATIVE ANALYSIS 1 

By William G. Madow 

This is the first of several papers in which there will be presented a general
approach to the statistical examination of hypotheses which are false if any of
several things are true. Phenomena requiring such a statistical theory are
investigated quite frequently. As examples may be cited the studies of lag
correlation in time series, periodogram analysis in geophysics, factor analysis
in psychology, and analysis into components in agriculture.$^2$

The theorems of this paper have one purpose: to permit the reduction of the
distributions by which the hypotheses are to be tested to essentially the joint
distribution of the statistics which contain the information offered by the data
concerning the truth or falsity of the things which will negate the hypotheses.
In order to do this it has been necessary to generalize the theorem of Poincaré
on the probability that at least one of several events occur.$^3$ As illustrations
there are stated, after Theorems III, VI, and IX, generalizations of a
distribution derived by Jordan, (5) page 109.$^4$

In a second paper, we shall give a complete derivation of the joint
distributions necessary for the applications of the analysis of variance. A
reconsideration of the Schuster periodogram will be included. In other papers these
results will be extended to problems arising in the theory of regression, and to
problems of the distributions of medians, etc.

The fundamental theorems of comparative analysis are now obtained in such
a form that they are applicable to problems in the theory of probability no
matter what the distributions may be. Some special cases of these theorems$^5$


1 Presented to the American Mathematical Society, March 27, 1937. Research under a
grant-in-aid from the Carnegie Corporation of New York.

2 Naturally these techniques are also useful in other branches of science than those in
which they were first applied. It should be noted that by analysis into components we
here refer to the work of Fisher, (2), chapter 6.

3 See Poincaré, (7), page 60. This theorem is attributed to Poincaré by Jordan, (6),
and Fréchet, (3).

4 This distribution states the probability that in $r$ trials of an experiment which has
exactly $n$ possible results, these results being mutually exclusive, each of the possible
results occurs at least once. Jordan's derivation has been simplified by Fréchet, (3),
page 12.

5 The theorems are, of course, part of the theory of measure and integration.
have been used in connection with the derivation of distributions of positional
statistics such as the $k$th in order of $N$ elements,$^6$ and others.

Let $\Omega$ be a collection of elements $x$, and let $\mathfrak{A}$ be a set of subsets of $\Omega$. Then,
the axioms which the elements of $\mathfrak{A}$ are to satisfy are$^7$

I. $\mathfrak{A}$ is a field;$^8$

II. $\Omega \in \mathfrak{A}$;

III. To every $A \in \mathfrak{A}$ there is ordered a non-negative real number $P(A)$;

IV. $P(\Omega) = 1$;

V. If $A \in \mathfrak{A}$ and $B \in \mathfrak{A}$, and $AB = 0$, then $P(A + B) = P(A) + P(B)$.

We shall regard $\Omega$ as the set of possible results of an experiment $\mathfrak{E}$. By events
we shall mean elements of $\mathfrak{A}$. The complement $\bar A$ of $A$ with respect to $\Omega$ will
be an element of $\mathfrak{A}$ if $A$ is an element of $\mathfrak{A}$. $\bar A$ consists of all elements of $\Omega$
which are not elements of $A$ and hence is the event which occurs if and only
if $A$ does not occur.$^9$

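Let the subsets of $\Omega$

\[ E_1, E_2, \ldots, E_k \tag{1} \]

be elements of $\mathfrak{A}$. Then, if $\alpha_1, \alpha_2, \ldots, \alpha_k$ is a permutation of $1, 2, \ldots, k$,
the set

\[ E_{\alpha_1}E_{\alpha_2}\cdots E_{\alpha_j}\bar E_{\alpha_{j+1}}\cdots\bar E_{\alpha_k} \tag{2} \]

is an element of $\mathfrak{A}$ and is the event which occurs whenever all the events
$E_{\alpha_1}, E_{\alpha_2}, \ldots, E_{\alpha_j}$ occur, while none of the events $E_{\alpha_{j+1}}, E_{\alpha_{j+2}}, \ldots, E_{\alpha_k}$
occur.

The events (1) are said to be independent if and only if

\[ P(E_{\alpha_1}\cdots E_{\alpha_j}\bar E_{\alpha_{j+1}}\cdots\bar E_{\alpha_k}) = \prod_{\nu=1}^{j}P(E_{\alpha_\nu})\cdot\prod_{\nu=j+1}^{k}P(\bar E_{\alpha_\nu}) \tag{3} \]

for all selections of the sets (1) and their complements.$^{10}$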

Theorem I. The probability that the first $j$ of the $k$ events (1) occur, while the
remaining $k - j$ events do not occur, is$^{11}$

\[ P(E_1\cdots E_j\bar E_{j+1}\cdots\bar E_k) = \sum_{\nu=0}^{k-j}(-1)^\nu\sum_{\substack{\alpha_1,\ldots,\alpha_\nu=j+1\\ \alpha_1<\cdots<\alpha_\nu}}^{k}P(E_1\cdots E_jE_{\alpha_1}\cdots E_{\alpha_\nu}). \tag{4} \]


6 See, for example, Gumbel, (4). It is noted that Theorems I, II, and III are stated by
Arne Fisher, (1), page 42, who assumes, however, that the events are independent.

7 These axioms are stated by Kolmogoroff, (6), page 2.

8 A set of sets is a field if the fact that $A$ and $B$ are elements of the set implies that
$A + B$, $AB$, and $A - AB$ are also elements of the set.

9 The event $A$ will be said to have occurred if the result of the performance of the
experiment $\mathfrak{E}$ is an element of $A$.

10 See Kolmogoroff, (6), page 9 for a discussion of various equivalent definitions of
independence.

Proof. Let $k = j + 1$. Then it follows from Axiom V that

\[ P(E_1E_2\cdots E_j) = P(E_1E_2\cdots E_jE_{j+1}) + P(E_1E_2\cdots E_j\bar E_{j+1}). \tag{5} \]

Hence the theorem is true for $k = j + 1$ and any $j > 0$. Let the theorem be
true for $k = j, j + 1, \ldots, k - 1$. From Axiom V it follows that

\[ P(E_1\cdots E_j\bar E_{j+1}\cdots\bar E_k) = P(E_1\cdots E_j\bar E_{j+1}\cdots\bar E_{k-1}) - P(E_1\cdots E_j\bar E_{j+1}\cdots\bar E_{k-1}E_k). \tag{6} \]

Substituting from (4) the theorem is proved.

Let $n \ge n_1 + \cdots + n_t$, $n_i \ge 0$ $(i = 1, \ldots, t)$; and let

\[ \frac{n!}{n_1!\,n_2!\cdots n_t!\,(n - n_1 - \cdots - n_t)!} = (n; n_1, n_2, \ldots, n_t). \]

Corollary. If, for each value of $\nu$, $(\nu = 1, 2, \ldots, k - j)$, the $(k - j; \nu)$
terms

\[ P(E_1\cdots E_jE_{\alpha_1}\cdots E_{\alpha_\nu}) \]

which can be obtained by selecting $\alpha_1, \alpha_2, \ldots, \alpha_\nu$ without repetition from
$j + 1, j + 2, \ldots, k$, are all equal, then

\[ P(E_1\cdots E_j\bar E_{j+1}\cdots\bar E_k) = \sum_{\nu=0}^{k-j}(-1)^\nu(k - j; \nu)P(E_1\cdots E_{j+\nu}). \tag{7} \]


Let

\[ S(\nu) = \sum_{\alpha_1<\cdots<\alpha_\nu}P(E_{\alpha_1}E_{\alpha_2}\cdots E_{\alpha_\nu}) \tag{8} \]

where the summation extends over the $(k; \nu)$ terms

\[ P(E_{\alpha_1}E_{\alpha_2}\cdots E_{\alpha_\nu}) \tag{9} \]

which can be obtained by selecting $\nu$ of the $k$ events (1) without repetition.
If all the terms (9) which can be obtained by selecting $\nu$ of the $k$ events (1)
without repetition are equal, then

\[ S(\nu) = (k; \nu)P(E_1\cdots E_\nu). \tag{10} \]


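11 By definition

\[ \sum_{\nu=0}^{k-j}(-1)^\nu\sum_{\substack{\alpha_1,\ldots,\alpha_\nu=j+1\\ \alpha_1<\cdots<\alpha_\nu}}^{k}P(E_1\cdots E_jE_{\alpha_1}\cdots E_{\alpha_\nu}) = P(E_1\cdots E_j) + \sum_{\nu=1}^{k-j}(-1)^\nu\sum_{\substack{\alpha_1,\ldots,\alpha_\nu=j+1\\ \alpha_1<\cdots<\alpha_\nu}}^{k}P(E_1\cdots E_jE_{\alpha_1}\cdots E_{\alpha_\nu}). \]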
Theorem II. The probability that exactly $j$ of the $k$ events (1) occur is

\[ P_{(j)} = \sum_{\nu=0}^{k-j}(-1)^\nu(j + \nu; \nu)S(j + \nu). \tag{11} \]

Proof. If $A_{(j)}$ is the subset of $\Omega$ defined by the requirement that exactly $j$
of the events (1) occur, then $A_{(j)}$ is the sum of $(k; j)$ disjunct sets:

\[ A_{(j)} = \sum_{\alpha_1<\cdots<\alpha_j}^{k}E_{\alpha_1}\cdots E_{\alpha_j}\bar E_{\alpha_{j+1}}\cdots\bar E_{\alpha_k}, \tag{12} \]

where $\alpha_{j+1}, \ldots, \alpha_k$ have those of the values $1, \ldots, k$ which remain after the
selection of $\alpha_1, \ldots, \alpha_j$. By Axiom V we may replace $A$ by $P$ in (12). Upon
substituting from (4) we note that the resulting terms of (12) which depend on
the same number $\nu$, $\nu = j, \ldots, k$, of events have the same sign, that all $S(\nu)$,
$\nu = j, \ldots, k$, occur, that no term depending on fewer than $j$ events occurs,
and that any particular $P(E_{\alpha_1}E_{\alpha_2}\cdots E_{\alpha_{j+t}})$ will occur in those of the terms
of (12) the $j$ occurring events of which are a subset of $E_{\alpha_1}, E_{\alpha_2}, \ldots, E_{\alpha_{j+t}}$,
and will occur in no other term of (12). Hence the coefficient of $S(j + t)$ in
(11) is $(-1)^t(j + t; t)$. This completes the proof of the theorem.

Corollary. If (10) is true for $\nu = j, \ldots, k$, then

\[ P_{(j)} = \sum_{\nu=0}^{k-j}(-1)^\nu(k; j, \nu)P(E_1E_2\cdots E_{j+\nu}). \tag{13} \]

Theorem III. The probability that at least $j$ of the $k$ events (1) occur is

\[ P^{(j)} = \sum_{\nu=0}^{k-j}(-1)^\nu(j + \nu - 1; \nu)S(j + \nu). \tag{14} \]

Proof. If $A^{(j)}$ is the subset of $\Omega$ defined by the requirement that at least $j$
of the events (1) occur, then $A^{(j)}$ is the sum of $k - j + 1$ disjunct sets:

\[ A^{(j)} = A_{(j)} + A_{(j+1)} + \cdots + A_{(k)}. \tag{15} \]

By Axiom V we may replace $A$ by $P$ in (15). Substituting from (11)

\[ P^{(j)} = \sum_{\nu=0}^{k-j}c_\nu S(j + \nu), \tag{16} \]

where

\[ c_\nu = (j + \nu; j + \nu) - (j + \nu; 1) + \cdots + (-1)^\nu(j + \nu; \nu), \qquad (\nu = 0, \ldots, k - j). \]

It is easy to prove that

\[ (-1)^\nu(j + \nu - 1; \nu) = \sum_{\mu=0}^{\nu}(-1)^{\nu-\mu}(j + \nu; j + \mu). \tag{17} \]

Corollary. If (10) is true for $\nu = j, \ldots, k$, then

\[ P^{(j)} = \sum_{\nu=0}^{k-j}(-1)^\nu(j + \nu - 1; \nu)(k; j + \nu)P(E_1E_2\cdots E_{j+\nu}). \tag{18} \]
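Theorems I, II, and III can be checked by direct enumeration on a small finite space. The sketch below is illustrative only: the space, its weights and all function names are assumptions introduced for the check. Each atom of the space is labelled by the set of events that occur in it, and no independence of the events is assumed.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

K = 4  # number of events E_1, ..., E_K

# Atom m (0 <= m < 2**K): event i occurs exactly when bit i of m is set.
weights = [Fraction(m + 1) for m in range(2 ** K)]
prob = [w / sum(weights) for w in weights]

def p_all(indices):
    """P that every event listed in indices occurs."""
    mask = sum(1 << i for i in indices)
    return sum(p for m, p in enumerate(prob) if m & mask == mask)

def S(v):
    """Sum of P(E_a1 ... E_av) over all selections of v events."""
    return sum(p_all(c) for c in combinations(range(K), v))

def first_j_only(j):
    """Theorem I: the first j events occur and the remaining K - j do not."""
    head = tuple(range(j))
    return sum((-1) ** v * sum(p_all(head + c)
                               for c in combinations(range(j, K), v))
               for v in range(K - j + 1))

def exactly(j):
    """Theorem II: exactly j of the K events occur."""
    return sum((-1) ** v * comb(j + v, v) * S(j + v) for v in range(K - j + 1))

def at_least(j):
    """Theorem III: at least j of the K events occur (j >= 1)."""
    return sum((-1) ** v * comb(j + v - 1, v) * S(j + v) for v in range(K - j + 1))

def exactly_direct(j):
    """Direct count over atoms with exactly j bits set."""
    return sum(p for m, p in enumerate(prob) if bin(m).count("1") == j)
```

Because the arithmetic is exact, the formulas and the direct counts agree identically and not merely to rounding.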
To provide examples illustrating these theorems let us consider $r$ experiments

\[ E^{(1)}, E^{(2)}, \ldots, E^{(r)}. \tag{19} \]

Let $E^{(i)}$ have $k$ mutually exclusive outcomes

\[ O_1^{(i)}, O_2^{(i)}, \ldots, O_k^{(i)}. \tag{20} \]

Then, it is easy to define the spaces $\Omega^{(i)}$, the probability function $P_i(E^{(i)})$,
the combinatory product

\[ \Omega = \Omega^{(1)} \times \Omega^{(2)} \times \cdots \times \Omega^{(r)}, \]

the set $\mathfrak{A}$ and the probability function $P(E)$ so that Axioms I, $\ldots$, V are
satisfied and hence Theorems I, II, and III are valid.

We shall assume that the experiments (19) are independent.

Let

\[ \bar O_t, \qquad (t = 1, \ldots, k) \]

be the event which occurs when neither $O_t^{(1)}$ nor $O_t^{(2)}$ nor $\cdots$ nor $O_t^{(r)}$ occur.
Then $O_t$ occurs if upon performance of the experiments (19) at least one of
$O_t^{(1)}, O_t^{(2)}, \ldots, O_t^{(r)}$ occur.

It is an immediate result of the definition of independence that

\[ P(\bar O_{\alpha_1}\bar O_{\alpha_2}\cdots\bar O_{\alpha_j}) = \prod_{s=1}^{r}\{1 - P(O_{\alpha_1}^{(s)}) - \cdots - P(O_{\alpha_j}^{(s)})\}. \tag{21} \]


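From Theorem I, the probability that $O_1, O_2, \ldots, O_j$ each occur while not
one of $O_{j+1}, O_{j+2}, \ldots, O_k$ occurs is

\[ P(O_1\cdots O_j\bar O_{j+1}\cdots\bar O_k) = \sum_{\nu=0}^{j}(-1)^\nu\sum_{\substack{\alpha_1,\ldots,\alpha_\nu=1\\ \alpha_1<\cdots<\alpha_\nu}}^{j}\prod_{s=1}^{r}\{1 - P(O_{j+1}^{(s)}) - \cdots - P(O_k^{(s)}) - P(O_{\alpha_1}^{(s)}) - \cdots - P(O_{\alpha_\nu}^{(s)})\}. \tag{22} \]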

From Theorem II, the probability that exactly $j$ of $O_1, O_2, \ldots, O_k$ occur is

\[ P_{(j)} = \sum_{\nu=0}^{j}(-1)^\nu(k - j + \nu; \nu)S(k - j + \nu), \tag{23} \]

where

\[ S(k - j + \nu) = \sum_{\substack{\alpha_1,\ldots,\alpha_{k-j+\nu}=1\\ \alpha_1<\alpha_2<\cdots<\alpha_{k-j+\nu}}}^{k}\prod_{s=1}^{r}\{1 - P(O_{\alpha_1}^{(s)}) - \cdots - P(O_{\alpha_{k-j+\nu}}^{(s)})\}. \]

Since the probability that at least $j$ of $O_1, O_2, \ldots, O_k$ occur is equal to 1
minus the probability that at least $k - j + 1$ of $\bar O_1, \bar O_2, \ldots, \bar O_k$ occur,$^{12}$ it
follows at once from Theorem III that

\[ P\{\text{at least } j \text{ of } O_1, \ldots, O_k \text{ occur}\} = 1 - \sum_{\nu=0}^{j-1}(-1)^\nu(k - j + \nu; \nu)S(k - j + \nu + 1). \tag{24} \]


12 There are, of course, other ways of computing these probabilities.
The case treated by Fréchet and Jordan is that which occurs when we assume
$P(O_t^{(h)}) = P(O_t^{(i)})$, $(t = 1, \ldots, k)$, $(i, h = 1, \ldots, r)$ and in (24) let $j = k$.

It is not difficult to obtain further generalizations of Jordan's distribution by
defining events which occur if and only if fewer than $j'$ of $r$ events occur and
then proceeding as above.
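The case in which every one of the $k$ results occurs at least once in $r$ independent trials can be computed and compared with a direct enumeration. The following sketch is an illustration under stated assumptions: the function names are hypothetical, the trials are taken to be identical with result probabilities `probs`, and the probability is written as the alternating sum over sets of results of the $r$th power of one minus the total probability of the set.

```python
from fractions import Fraction
from itertools import combinations, product

def all_results_occur(probs, r):
    """P that each of the k results occurs at least once in r identical trials."""
    k = len(probs)
    total = Fraction(0)
    for v in range(k + 1):
        for subset in combinations(range(k), v):
            # Probability that none of the results in subset occurs in r trials.
            miss = 1 - sum(probs[t] for t in subset)
            total += (-1) ** v * miss ** r
    return total

def all_results_occur_direct(probs, r):
    """The same probability by enumerating every sequence of r results."""
    k = len(probs)
    total = Fraction(0)
    for seq in product(range(k), repeat=r):
        if len(set(seq)) == k:
            p = Fraction(1)
            for t in seq:
                p *= probs[t]
            total += p
    return total
```

With three equally likely results and four trials both functions give $4/9$, since 36 of the 81 equally likely sequences contain every result.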

Certain useful generalizations of Theorems I, II, and III will now be derived.
Let the subsets of $\Omega$

\[ E_1^{(s)}, E_2^{(s)}, \ldots, E_{k^{(s)}}^{(s)} \qquad (s = 1, \ldots, p) \tag{25} \]

be elements of $\mathfrak{A}$, and let $N = k^{(1)} + k^{(2)} + \cdots + k^{(p)}$.

Let $j^{(s)} \le k^{(s)}$, $(s = 1, \ldots, p)$; and let

( 26 ) q (<) = n'n Pi #) « = i, ...,p), 


Let 

(27) e (,, ' = ri n sr « = i,..., P ). 

Furthermore, let for each value of $, (s = A, • • • , p), the ( k (,) — f $) ; v if) ) 
possible distinct selections of v {9) of the k {9) — j {9) sets 

(28) £#>+*, ■■■,E ( k , <)> 

be arranged in some order, and, if the intersection of the v ( ' ) sets of the t, th 
selection be denoted by 


(29) 
let 

(30) 


g’V') (* = h, • • •, p), 

(*. = 1, 2, • • • , (k M - j M ; ,/*>)), 

• • •, »' (P) ) = fl qHv (,) ). 

$-h 


There are $\prod_{s=h}^{p}(k^{(s)} - j^{(s)}; \nu^{(s)})$ sets (30), for each value of $h$, $(h = 1, \ldots, p)$,
and any set of fixed values of $\nu^{(h)}, \ldots, \nu^{(p)}$.

Let for each value of $, (s = ft, • • • , p) the (fc (,) ; p (-) ) possible distinct selec¬ 
tions of v {9) of the fc (a) sets 


(31) E\*\ (i - 1, • • • , fc ( '>), 

be arranged in some order, and if the intersection of the sets of the i 9 th selection 
be denoted by 

(32) g> (,) ) 


let 

(33) = ft q% (,> )- 
There are $\prod_{s=h}^{p}(k^{(s)}; \nu^{(s)})$ sets (33), for each value of $h$, $(h = 1, \ldots, p)$, and any
set of fixed values of $\nu^{(h)}, \ldots, \nu^{(p)}$.

It is clear that the various sets that have been defined are elements of A. 
The fact that the sets are the events which occur if and only if certain sets of 
events occur is also too obvious to require further comment. 

Theorem IV. The probability that of the $N$ events (25) the first $j^{(s)}$ of
superscript $s$ occur and the remaining $k^{(s)} - j^{(s)}$ of superscript $s$ do not occur, $s = 1, \ldots, p$, is

fc (ih-jU) 

p(q (p) Q ( p) ) = E E ••• E (-D' 

p( XL-0 0 p(pT—0 

(jfc(l)~i (!);»(!)) (fc(p)-y(p); 

E ••• E 

*1-1 *0-1 


j r (l) +r (2) + ... +r (p) 
(fc(p)_y(p);y(p)) 


Proof. Theorem I is a proof of Theorem IV for $p = 1$. The theorem may
then be proved either by regarding it as a special case of Theorem I and
collecting terms, or by induction.

Corollary. If, for each possible set of values of v a \ v (2) , • • • , v (p) the 

ft (k u> -/V 1 ) 

a-1 

terms 

(35) P[?* V -V 1 ', ••• ,f W )] 

are all equal, then 

fcPL-yP) 

P(Q w Q ip) ’)= E ••• E (-1)' U)+ ”- 

v(lL-0 f(p)— o 


(36) 






«-i 


Let, for each value of h, (h = 1, • • •, p), 

S(r M , » (k+l) , 


(37) 


(*<*>;„(»)) 

= E 

<4-1 


p(p);v(p)) 


(p> 


)]• 


It is apparent that by using (34) it is possible to obtain an expression for (37) 
which does not depend explicitly on Q (h ~ iy . In fact 

ft(l )—j ( 1) 4(4-1 )-f (4-1) 

s(v (k) , • ■ • ,v w ) = E ••• ..E (- 1 ) ,a>+ , + ' < ‘' ,) 


|r (A — l )mQ 


(38) 


(t a )_y(l);,(!)) 


sr 

< 1-1 


(fc(4-l)_y(4-l) ;F (4-l)) (*P>;„<4)) 

E E 

<4—1—1 <4-1 

1 /( 1 ) 


E 


<»—i 


... 


- <p) )]. 
If the different terms of (37) are all equal, then
(39) S(, (W , - ft (k w ; v (,) )P[0 (i “ 1) Q ( * -i) O'®, ■ • • •, r w )]. 

••-A 

If the different terms of (38) are all equal, then


8(p (h) , 


,.<p) 


) 


(40) 


*UL^<1) 


p(ulo 

- ,«5 

h-l 

n (k u) 

-/'V'* 5 ) 

«-i 

pi? 1 -- 


(- 1 ) 




.-A 




V”)]- 

Theorem V. The probability that of the $N$ events (25) the first $j^{(s)}$ of superscript
$s$ occur and the remaining $k^{(s)} - j^{(s)}$ do not occur, $(s = 1, \ldots, h - 1)$, and exactly $j^{(s)}$
events of superscript $s$ occur $(s = h, \ldots, p)$, is


P(jd>) .. #<p),(Q < * -1> Q (h ~ v ') 


(41) 


jfc(»)-,-(A) 

(a— i) /-,<*-!) '■j _ y< 

,(fslo 


*(p)__.(p) 

E 

r (p)—0 


(- 1 ) 




ft (j M + r M -,v U) )S(j M + / k \ • • -,j W + r w ). 


l—h 


Proof. The theorem may be proved, either by induction using Theorem II, 
or by obtaining disjunct sets as in Theorem II and using Theorem IV. 

Corollary I. If (39) is true for all sets of possible values of v (h) , • • • , v (p) 
then 




(42) 


k (h)-jU) fc (p) J= y(p) 

J2 ••• yj 


ft v w ) P[Q ( * _I> Q ( * -1> ’g I '" , (y (l ‘ > , ..., P w ). 


t—h 


Corollary II. If (40) is true for all sets of possible values of v , v , 
then 


(i) ..(2) <r> 

* ” 


Jk(l)-y(l) *(p >~;(p) 

P ii a>...,iMQ (k - 1) Q (h - iy )= E •• ~ 

V (1)—o p (p )-.0 


>)~j (p) 

£ (- 1 )'“ 


>+...+,(p) 


(43) 


n ft" -/•>;,“) ft „'•>) 

8-»l B—h 

Plq'-V'\ ■■■,v (h - l) )q 1 - V h) •••»>>)]. 


Theorem VI. The probability that of the N events (25) the first j M events of 
superscript s occur and the remaining k is) do not occur , s = 1, • • • , g — 1, exactly 
j (8) events of superscript s occur (s = g y • • • ,h — 1), and at least j ($) events of 
superscript s occur (s = h, • • • , p) is 





<;(>)—j(>) 


(Q (s_1) Q (e_1) ') = £ 

vGTm. 


k(p)-i(p) 

£ (-D' (,)+ - + ' ( ' > 

^(pTLio 


(44) 


ft (i w + v ( * ) ; ,«) ft (/** + r w - 1; p < * > ) 

5(i (s) + v (9 \ + * < ’ ,) ). 


Proof. The theorem may be proved either by induction using Theorem III 
or by obtaining disjunct sets as in Theorem III and using Theorem V. 
Corollary I. If (39) is true for all sets of possible values of 






then 

—y(g) 

fc(p)-y(p) 







r(7)—0 



(45) 

ft (fc l,> ; * u> ) ft 10 ,(,) + 

- l;/* ) )(fc ( * ) ;/* > + /*>)] 



a— g 

... 

, P W )]. 

Corollary II. If (40) is true for all sets of possible values of v w , v { '\ • 


then 

*(l)-j(l) 

&(p)-y(p) 



p\)\l !:::j^» ) (Q ( »- u o ( -'- 1> ') « £ 

a(l)-0 

y ^-Tl'(P ) 

„(p7Lo 


(46) 

ft (fc u> - j l,) ; v ( ”) II (fc (,) ; v M ) ft 

[(/•“ + -‘" - l; 

+ P (,, )l 


a*»l s—y s*~/i 




Plq 1 ' 1 

fW', ••• 

■,* W )V 


Let us again consider the experiments (19), and let us assume that
$E^{(i)}$, $(i = 1, \dots, r)$, has as its mutually exclusive results

(47) $O^{(i)}_{ts}$, $(t = 1, \dots, k^{(s)})$; $(s = 1, 2)$.

Let $O_{ts}$ be the event which occurs if, upon performance of the experiments
(19), at least one of the events $O^{(1)}_{ts}, \dots, O^{(r)}_{ts}$ occurs, and let $\bar{O}_{ts}$ be the
event which occurs if and only if $O_{ts}$ does not occur.

We may state the probability that the event $E_1$, which occurs if and only if
at least $j^{(1)}$ of the events $O_{t1}$, $(t = 1, \dots, k^{(1)})$, occur, and the event $E_2$, which
occurs if and only if at least $j^{(2)}$ of the events $O_{t2}$, $(t = 1, \dots, k^{(2)})$, occur, both
occur.

It is apparent that

(48) $P(E_1 E_2) = 1 - P(\bar{E}_1) - P(\bar{E}_2) + P(\bar{E}_1 \bar{E}_2)$,

where $\bar{E}_s$ is the event which occurs if and only if $E_s$ does not occur, $(s = 1, 2)$.
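
Identity (48) can be checked by brute force on a small finite probability space. The sketch below is only an illustration: the three-coin sample space and the two events are hypothetical choices, not the events $E_1$, $E_2$ constructed from the experiments (19).

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy space: three fair coin tosses, 8 equally likely outcomes.
outcomes = list(product([0, 1], repeat=3))


def prob(event):
    """Exact probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for w in outcomes if event(w)), len(outcomes))


def e1(w):
    """Illustrative event E1: at least two heads."""
    return sum(w) >= 2


def e2(w):
    """Illustrative event E2: the first toss is a head."""
    return w[0] == 1


# Left side of (48): both events occur.
lhs = prob(lambda w: e1(w) and e2(w))

# Right side of (48): written entirely in terms of the complementary events.
rhs = (1
       - prob(lambda w: not e1(w))
       - prob(lambda w: not e2(w))
       + prob(lambda w: not e1(w) and not e2(w)))

print(lhs, rhs)
```

Exact rational arithmetic is used so that the two sides can be compared for equality rather than to within rounding error.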
From Theorem III


P{£.) = E - j M + v (,) ; v l,) )S w (k U) - ;<*> + V (,) + 1) 

><•7-0 


(s = 1,2), 


where 


*<•> 

S w Qc w -/•> +v (,) + i) = Z 

«l»* * ’.«*(•)—y < • 
«i<* •*<«*(•)— 


XI {1 — P(Oaia) — ... — P(Oi^(.)_y(,) +|f (, )+l «) J> (s ~ 1 , 2). 


From Theorem VI 


( _l)„U>+„(,) jj (jfc <a> +r (.) ;jr (.) } 

«(*" + f c1> + 1, Jb» + 1), 

where 

(*<1);,-<1)_,,U)_1) (*(*);,• (2)-„(2)-.l) 

s(fc a) - i (1) + , a) + 1 , fc (2) - / 2) + v (2) + 1 ) = E E 

*1—1 *2-1 

P[g*'“(ft" - j a) + r <u + 1, & <2) - + „ <2) + DJ, 

and 




p(M) = Z E 

,TiT_« ,<T7-o 


P[? ,I,1 (fc a) - j W + , (1> + 1, /fc (2) - / 2) + , <2) + 1)] = 

r ( *C1)_ # (I) + ,(I)+1 *(*)—/ («)+»(«)+l 'l 

n 1 - E P(oLVi) - E P(o$)k 

i-l k V-l M-l J 

the subscripts a,, (v — 1, • • • , fc a) — j a) + v a) + 1), being those of the i 2 th 
selection of k a) — j m + v (1) + 1 events from k w events, and the subscripts 
/9„ , (m = 1, • • • , fc <2) — j w + v m + 1), being those of the t** h selection of 
k m — j m + r <2) + 1 events from k {2) events. 

The desired probability is then obtained by substituting from (49) and (50) 
into (48). The procedure is perfectly general, and applies directly to situations 
in which p > 2. 

We shall now investigate the results obtained by requiring that the events 
considered satisfy a relation of implication. 

Let the subsets of $\Omega$

(51) $E_{1s}, E_{2s}, \dots, E_{ks}$, $(s = 1, \dots, p)$,

be elements of $A$, and let

(52) $E_{is} \subset E_{it}$, $(i = 1, \dots, k)$, if $s < t$.

It follows that

(53) $P(E_{is} E_{it}) = P(E_{is})$, $(i = 1, \dots, k)$.

Let ji < jt < • • • < jt and let 

(54) 

Let ji < jt < • • • < jt and let 

(55) 


- n n 
*~1 


«= 1,2, ...,p). 
(« = 1,2, ---.p). 


(56) 


From (52) and (53), it follows that 

p(q<q;) = p([n ft Eu I 

\L*~ 1 *“J«-i+l J 


i y«+i "1 * \ 

n n n Bui (j<> — o) « = 1,2,..., P ). 


Let ji < ji < • < j P and for each value of a, ($ = p), consider a 

selection of j g + v 8 events of second subscript s from (51). Let the p selections 
thus obtained be such that 

j* “f“ V* ~ 3*+' 1 1 (® h 2, * ’ ‘ , p), (jp-fl ^)y 


and if is one of the events of the selection of events of second subscript $ 
then the fact that t > s implies that E it is one of the events of the selection of 
events of second subscript t. 

From (52) and (53), the probability of the occurrence of all the events of the 
p selections thus obtained is a function of j p + v p events, m« of which are of 
second subscript s, (s = 1, • • • , p) where 

(57) Ml + M2 + * * • + = js + V ay (S = 1, • • • , p), 

and for a given set of values of ji , j 2 , • • • , j P the m« and v a determine one another 
uniquely, (« == 1, — , p). 

For a definite set of values of ji , • • • , j p and Mi > • • * , Mp or ji , • • • , j p and 
v \, • • • , v p there will be 


0*»+i - h\ v *) = 0’*+1 ~ ~ Mi - • • • - Ps), (s = 1, •• •, p), (jp +1 = k ) 

possible distinct selections of j a + v a , (s = 1, • • • , p) events of second sub¬ 
script 8, j 8 of which are preassigned, from j s+ i events, (s = 1, • • • , p). 

Let these selections be arranged in some order for each value of s, s = 1, • • • , p, 
and let 

(58) q h i 2 ... , p (mi , M2 , • • • , Mp) 

be the event which occurs when for all values of s, (s = 1, • • • , p), the events 
of the i a th selection of j a + v a events of second subscript s all occur. 18 

18 It is understood that the j» preassigned events of second subscript s are among the jt 
preassigned events of second subscript (i > s) in the events (58). 
A typical event (58) is

(59) • • •, Mp) = ft IT Go + — 0 )- 

a-1 »-j«-i+r«-i+l 

There will be, for a definite j, events of second subscript s, (s = 1, • • • , p) 


(60) 


ft 0»+1 y»j Vi,), 

*—1 


Op+i 


events such as (58). 

For a definite set of values of mi , • • • , Mp there will be, for each value of s f 
(« « 1, ... , p) 

(A* - JU—1 ~ • • • — Ml J Ma)) (« == 1, 2, • • • , p) 

possible distinct selections of j s + v* events of second subscript s, j,-i + p,_i 
of which are preassigned from k events, (s = 1, * • • , p). 

Let these selections be arranged in some order for each value of s> 

(s = 1, • • • , p), 

and let 

(61) ffijtfi ••• » p (m l , M2 , “ • , Mp) 


? be the event which occurs if and only if, for all values of s the events of the 
t/ h set of j a + v , events of second subscript s all occur, (s = 1, • • •, p), and 
the first subscripts of the events of the i 9 th set of events of second subscript s 
are among the first subscripts of the events of all the selections of events of 
second subscript greater than s, ($ = 1, • • • , p). 

There will be 


(62) 


(fc; Ml , M2 , • • • , Mp) 


events (61) which may thus be obtained. 

Theorem VII. The probability that of the pK events (51) the first j $ events of 
second subscript s occur and the remaining k — j* events do not occur , s = 1, • • • , p, 
is 


7 2 7 1 7 *“7 2 k ~jl > 

P(.QpQ' v ) = E E • 2 (-1)»^ + - 

vj—0 kj—0 


(63) 


+*e 


v p ~ 0 

(72-7 l>i) Us-iiWi) 


i 1-1 


< 2-1 


-7 l>l) (7 3-7 2^2) 

2J 2J *** ^[^<l<2 ■ • ’I'pCmI 7 M2) * 

ip-0 


, Mp)]) 


where the event Q* determines the j, — events of second subscript 

a, (a * 1, • • • , p), which have as first subscripts all numbers 1, 2, • • • , j 9 which 
are not among the j 9 ~i + p«-i numbers determined by the events of lower second 
subscript than s which are contained in q it ... ip * , y p ). 

Proof. Expand (56) by means of Theorem IV. 
Corollary. If, for each fixed set of values of $\mu_1, \mu_2, \dots, \mu_p$ the terms (58),
in number (60), are all equal, then

ys —/1 /i—/* k-iv v 

««,«„>- Z Z z (-ir + '* + ' 

(64) ri-0 »|-0 r„-0 s-1 

- M2, • • • , Mp)] O’p+1 = &)• 

Let 


(65) 


7*0*1» M2, 


’ , Mp) 5=5 2 

*,—1 if-1 i„-l 


If all the terms of (65) are equal, then 


•••* j >(mi, M2, * * * , Mp)L 


(66) T(mi, • • •, Mp) = (fc; Mi, M2, • • • , Mp)-P[?i-i(mi, • • •, Mp)1- 


Theorem VIII. The probability that of the pK events (51) exactly j 9 events o/ 
second subscript 8, s = 1, • • • , p occur, zs 


■</,...*> -££••• S (-D’ i+ ' i+ 




(67) 


Vj—o v 3 -0 


jft (m.;;. - m - ... - m.-i) 2P(mii p if •.., ^ p ). 


Proo/. If il(/ 1# is the subset of 12 determined by the requirement 

that exactly j, of the events (51) occur (s = 1, * • • , p), then 4 (J1 .,^) is the 

sum of 

i ii, J2 — ji , jz — j • • • , jp — jp-l) 

disjunct sets which may be obtained by replacing P by A in (56) and forming 
(56) for all selections of j 8 — j f _i occurring events from k — j,_i events, 
(s = 1, • • • , p). By Axiom V, P (Jlt ip) is the sum of the probabilities of 
these disjunct sets. 

Substituting from (63), it is noted that all terms (61) which depend on the 
same p $ , (s = 1, • • • , p), have the same sign and that all T(mi , M 2 , • • • , Mp) 
for which 

0 < v, < j.+1 - j », (*=!>•••, P), 

appear and only those appear. Furthermore any particular term (61) will 
occur in those of the terms (63) the j 9 — j 9 ~1 occurring events of second sub¬ 
script $, (s = 1, • • • , p), of which contain a fixed v 8 ~i events, the remaining 
j 9 — j 9 ~i — events being a subset of the y 9 events of second subscript «, 
($ = 1, • • • , p), that actually appear in the particular term (63). Hence the 
coefficient of T(m , • - • , p p ) is 

(_!)>.+..•+* 02.;,• _ w -M.-l), 


(mo = 0). 
Corollary. If (66) is true for all sets of possible values of $\mu_1, \mu_2, \dots, \mu_p$,
then


P Ui .;,) 


it-li *ZUP 

- £ £ • •• £ (-d' i+,,+ - 


*1—0 n—0 9 P —0 


VO©; „ . . x 

Vi,Ji - Ji — V\) Vi, • • • ,Jp - Jp-1 - Vp- 1, Vp) 

Pli • • * iMp)]- 

Theorem IX. The probability that of the pk events (51) at least j 9 , but not more 
than jg+i , events of second subscript s occur, (s = 1, • • • , g), and exactly j 9 events 
of second subscript s occur , (s = g -f 1, • • • , p) is 

(69) = ££■••£ *y f+ ,... w a, *,•••, o, 

92—0 0*-O 0 a —0 

where, if a 1 in the i th position is denoted by Si, (i = 2, • • • , p), 

R(jg+i, • • * * * > &yi > 0, • * •) 0, 5 72 +i, • • • > 5 78 ,0, • • •, 0, • • *, 5n+i> * • •) &g) 

k—ip ig+2—ig + l jg + l~ig Jyi+l - hi" 1 J Tg—J 7a -1 — 1 18~ll~l 

= £••• £ £ ••• £ £ ••• £ (-ir + ’* + -+’> 

i'p-O *0 + 1-° v yr , jy i ~jy, p i“° 

(70) 0*1 + *i ~ 1; ^i) • • • On + ~ j’ 73-1 ~ ^73-1 ~ 1 ; ^ 73 ) 

0*74 + v 74 ^73 ^78 Ij ^ 74 ) * * ' (ip + ^P ~ ip-1 ^p-l? ^p) 

rOl + *'!»•• • li7a + ^78 *“ jyz-l “ *7|-1> 0 , • • •, 0 , 
iv4 "1“ ^74 i73 v 73 > * • * t fp 4" V v jp -1 *V-1)* 

Proof. We note first that there are 2 <7 ~ 1 terms in (69). Since 

(71) = iS • • • £ £ iV-x, 

x 0 —n x 2 —y 2 ^i—n 

the theorem may be proved by a process of repeated summation. From (67) 
and (71) 

M M—Xi Xj—X* A —jp 

p(/i) - V V V v r—iv» +, * + - +,r J 

~(x 2 •••x,/<,+!•• */p> — La 


Xi—J i * 1 —0 • > 2*"0 *jj—0 


(Xi + vi; + V 2 — Xi — ; v 2 ) • • • (i P + v P — ip-i — Pp-i; vp) 

T (\i + »»i, X* + — Xi — i>i, • • • ,j p + v p — jp -1 — v p -i). 

For fixed values of X*, Xa, • • • , X, there will occur in (72) all terms 

(73) T(ji + ft , Xs + vt — ji — ft , • • • , jp + Vp — jp_i — Vp-i), 

(ft = 0, • • • , X 2 - ji), (0 < v. < X, +x - X,), (s = 2, • • • , p), 

(X»+» — jg+t & — 1) • 1 ’ > V 0)> 

and any definite term (73) will occur in all 

(74) P(/l+«,A„ ■■;ip) 
for which

0 < a < ft . 

In (74), the definite term (73) will have coefficient 

( _iy>i-«+M+... + ;jl + a)(Xj + Vi _ ^ 

(75) • • • (;' p + v p - jp-i - v P -i , v p ), (a = 0, 1, • • • , ft), 

(ft = 0, • • • , X* — ji). 

Hence, in (72) the definite term (73), will have coefficient 
(_!)*>+'•+ - +'*(;, + ft - 1; ft)(Xj + p s - j l - ft ; p 2 ) 

• • • Op + "p - ip-i - v-i ; >'p), 


(76) 

We now evaluate 


7 > (xi, ) . -./p) — -R(X! ■■•/„)( 1). 


P (ji ii) _ V pO’i) 

X a -jj 


For any fixed values of X 3 , • • • , X„ , there will occur in (77) all terms 
T(ji + ft , is + ft — Ji — 0 i, Xa + >7 — is — ft , 

lip ip—i *v— 0 > 

for which either 0 < ft < X 3 — is ; 0 < ft < is — ji — 1 or ft = j ' 2 — ji + 7 , 
0 < 7 < X 3 — is ; 0 < ft < X 3 — jt — 7 . 

Let 0 < ft < is — ii — 1; 0 < ft < X 3 — is • Then the term (78) will occur 
in all 


i 2 ~\~o 1X3,* ‘',j p )i 


such that 


0 < a < ft . 

In (79), (78) will have coefficient 

, on . (— i)*.+*-«+'.+-"+'P(i 1 + ft _ 1; ft)(j 2 + ft - ix - ft - 1; ft - a) 
^oUj 

(X3 + vz — is — ft ; ^3) • • * (ip + — ip~i — j'p-i j j'p)- 

Hence in (77), (78) will have coefficient 

(_ 1 )3l+/Jj+-»+."+^(i i + ft _ 1; ft)(j, + ft - ix - ft - 1; ft) 

(X 3 + Vi - ji - ft ; Pa) • • • (jp + Vp - jp-i - Pp-x ; v p ), 

(01 = o, • • • , is - ix - 1), (ft = 0, • • • , Xa - is), 

(p« = 0, • • • , X,+x — X,), (s = 3, • • • , p) j 

(x»+» = i„+.), (* = l, • • •, v ~ q)‘ 


(81) 
Now let ft“^ — jx 4 ' 7 ; 0 ;<y<X*—^i;0<ft<X* — jt — y- Then the
term (78) will occur in all terms (79) such that


y < a < ft, 

and in (79), (78) will have coefficient (80). Summing for a, (a = y, • • • , ft), 
we obtain as the coefficient of (78) in (77) 


and 


Hence 

(82) 


0 , 


if ft > y, 


(- 1 / ,+ ' ,+ - + ’ , 0'i + ft - l; ft)(x» + *•« - ji - ft;»») 

• • • (ip + y, - jp- 1 - xp-i; vp), if ft = y. 

•Pal-••»»> = P(x, - j p )( 1,1) + Pcx,.••/„)( 1, 0). 


If we examine (82), we note that the result of summing with respect to Xt 
has been the replacement of (76) by two sums which are similar to (76) in that 
the next summation index, in this case Xj, occurs in exactly two limits of sum¬ 
mation. If it can be shown that the two sums which occur in (82) each result 
in a pair of sums after summation with respect to X 3 , or more exactly if 


X,+* 

(83) X. + l ! 


,e 9 ) 


^2 > • • • f l) *4“ ^(X f +2,* • $2 , • • • 9 t 0) 


then the proof will be completed. 

Since the truth of (83) may be demonstrated in exactly the same way in 
which (82) has been shown to be true, the theorem is proved. 

Corollary. If (66) is true for all sets of possible values of /*i, M2, • • • , Mp 
then 


* * * ^71 ) 0, • • • , 0, 67,4.1, • • • , 672 , 0, * • • 0, • • * ) 67*4-1, • • • , dg) 
k—jp ig+t—ig + l ig + l—ig J74+I—;‘y,-1 j 2—J1—1 

= E-- E E ••• E ••• E (-1 r ,+, ” + * 

^+1-0 p g -0 v 7 t mm jy i -iyt 'l-o 

O'l + Vl - 1 ; Vl) • • • (jy, + Vy, ~ jy,-l ~ P-,,-1 ~ lj Vy,) 

( 84 ) (jy, "I" Vy, jy 1 Vy, 1 j v yj) ' ' ‘ {jp Vp jp— 1 v p— 1 j v p) 

(k;jl + V 1, • • • ,jy, + Vy, — jy,-\ ~ P-,,-1 , jy, 

+ Vy, - jy, - Vy, , • • • , jp + Vp - jp- 1 - Pp-l) 

P[Ql- l{jl + Vl, ••• ,jy, + Vy, - jy,-l - Vy,-l, 0 , • * • , 0 , 

jy, + Vy, - jy, - Vy,, ,jp + Vp- jp-l ~ Pp_l)]. 
Let us again consider the experiments (19) and let $E^{(i)}$ have as possible results

Oft (j= 1 , •••,*),<«- 1 , 2 ) (*- 1 , 2 , 

Let 

0 (i) -> 0 w (» - 1 , • • • , r), 

0,1 (7-1, •••,*), 

i.e. occurs whenever OyJ 1 occurs. Furthermore let the outcomes 

o8 ) ,okV--,o8 > 

be mutually exclusive. 

Let 

Ojt i 

occur if and only if none of 

0 ( . l) O tS) ••• 0 (i> 

'-'J* I t J 


(« = 1 , 2 ), 


occur. 

We may wish to know the probability that at least j\ of On, , On and 
at least j t , js > ji, of On , On , On occur. 

From Theorem IX this probability is equal to 

(85) P ow,) = R(l, 1 ) + fi(l, 0 ), 


where 

fid, i) = *2’ ”£ 1 (-ir +,, (ji + pi -1; *) 

vs—o v i **o 

0* + n — ji — Vi — l]v t )T(ji H- vi ,jt + vi — ji — v^, 

and 

fi(l, 0) = 2 (— l) M 0'i + vi — 1; vi)T(ji + vi). 

From (63) 

(fc; 71 -f Fj ) (fc-; x~ri; j a+M-J l-*l) 

(86) T(ji + pj, j t + vt — ji — pi) = 2 2 

<1-1 »,-l 

P[g<,.,(ii + vi ;jt + pj — ji — pi)], 

where, from (61) 

n+n y a+M 

J*2 + ^2 — il — F|) *» IT <5«,1 IT ^<*,2 > 

F—l F«-»7i4*Fi4-l 

the subscripts 

(87) «i, «», • • •, a >i+n 
being the first subscripts of the $i_1$th selection of $j_1 + \nu_1$ events of second
subscript 1 from

Ou, On , • • • , , 

and the subscripts 


a 7i+>i+i t a ii+n+* > * • • t a i i+*t f 

being the first subscripts of the i* th selection of j% + v* events of second subscript 2, 
ji + vi of which are (87), from 

On, On, • • • , On . 

It is easy to see that 

PlfaiiiUi + v u ja + ~ ji — J'i)] = XI ^1 — ]C P(0*!i) — ]C P(OlVt) 

Furthermore 

(88) T(ji + vi) = £ PliuUi + 

<1-1 

where 

r f n+'i . 'l 

+ ^i)] * IT \ i — Z) -P<oLVi) f* 

<-l I* M-l J 

Substituting from (86) and (88) into (85) the desired probability is obtained. 
It may be remarked that theorems which have the same relation to Theorems 
VII, VIII, and IX that Theorems IV, V, and VI have to Theorems I, II, and 
III may be obtained without much difficulty. 
REPLY TO MR. WERTHEIMER’S PAPER

Richmond T. Zoch 


The attainment of rigor both in applied as well as pure mathematics is a slow 
process, and for this reason criticism of my paper, if constructive, is welcomed. 

Properties like continuity, differentiability, and dimensionality are local
properties, that is to say a function may be continuous or differentiable over a
certain range but not outside this range, or otherwise a function may be
continuous or differentiable over a given range except for singular points.

The presence of singularities in functions does not necessarily cancel their
utility. Thus the function $y = \tan x$ contains points where it is discontinuous,
but ordinarily it is regarded as a continuous function and the presence of these
singular points seldom handicaps one when working with this function.
Similarly, the function $f = \bar{x} - \mu_3 / (2\mu_2)$ is a function which satisfies all four
Axioms as stated in Whittaker and Robinson’s book and expresses the mode of
Pearson’s Type III curve as a symmetric function of the measures. The fact that
this function is not differentiable along the line $x_1 = x_2 = x_3 = \dots = x_n$ will never
handicap the investigator, for unless the frequency distribution is clearly skew
the Type III curve would not be used to represent it.
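
The behaviour along the line of equal measures can be illustrated numerically. The sketch below assumes that the function in question is the usual moment expression for the Type III mode, $\bar{x} - \mu_3/(2\mu_2)$, with central moments taken about the mean of the measures; the helper names and the sample directions are illustrative only. It estimates one-sided slopes at a point $(c, c, c)$ along three directions. A differentiable function would have a slope along $v + w$ equal to the sum of the slopes along $v$ and $w$.

```python
def mode_expr(xs):
    """x-bar - mu3 / (2 * mu2), central moments taken about the mean of xs."""
    n = len(xs)
    mean = sum(xs) / n
    mu2 = sum((x - mean) ** 2 for x in xs) / n
    mu3 = sum((x - mean) ** 3 for x in xs) / n
    return mean - mu3 / (2 * mu2)


def directional_slope(c, v, t=1e-3):
    """(f(c + t v) - c) / t, taking the value on the line x1 = ... = xn to be c."""
    xs = [c + t * vi for vi in v]
    return (mode_expr(xs) - c) / t


v = (1.0, 0.0, 0.0)
w = (0.0, 1.0, 0.0)
v_plus_w = (1.0, 1.0, 0.0)

slope_v = directional_slope(1.0, v)
slope_w = directional_slope(1.0, w)
slope_sum = directional_slope(1.0, v_plus_w)

# The slopes exist in every direction, but they are not additive in the direction.
print(slope_v, slope_w, slope_sum)
```

Under these assumptions the slopes along $v$ and $w$ are each about 1/6 while the slope along $v + w$ is about 5/6, so no single linear map reproduces them. The values themselves tend to $c$ from every direction, which agrees with the function being continuous but not differentiable on the line.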

It seems that Mr. Wertheimer bases nearly all his criticisms on the tacit 
addition of the word “ everywhere ” to Axiom IV as stated in Whittaker and 
Robinson’s book. The word “everywhere” is not in the statement of Axiom 
IV and I assumed nothing else than stated in the axiom. 

If one deliberately adds the word “everywhere” to Axiom IV then nearly all
my criticisms of previous writers are incorrect, unfair, and unjust. However,
it does not seem that clearness and rigor in mathematics are increased by
reading into an axiom a word that is not there.

Consider first the criticism in my paper which remains valid even when the
word “everywhere” is added. (Schimmack uses the word “everywhere” on
page 127 although Whittaker and Robinson do not.) Both Schimmack and
Whittaker and Robinson proceed as at the top of page 217 of the book by the
latter authors with the statement: “In this equation make $k \to 0$ then each
of the quantities $\partial f / \partial x$ tends to a value which is independent of the x’s ....”
This statement rests on the tacit assumption that the quantities $\partial f / \partial x$ are
functions of $k$. Even if such were true the use of tacit assumptions in a rigorous
proof is objectionable, but as a matter of fact these quantities are not functions
of $k$. Thus the particular proof given in Whittaker and Robinson’s book as
well as in Schimmack’s paper is altogether lacking in rigor even when the word
“everywhere” is added to Axiom IV. Both Schiaparelli’s and Broggi’s proofs
appear to be entirely rigorous if the word “everywhere” is added to Axiom IV.

In preparing my paper I assumed that no prohibition on functions which had
singular points was contained in Axiom IV. In other words, I assumed since
the word “everywhere” did not appear there was no valid objection to
introducing and discussing functions with singularities. The functions I introduced
are everywhere continuous but they are not differentiable along the line in
Euclidean n-space defined by $x_1 = x_2 = x_3 = \dots = x_n$. They are differentiable
at every other point in the space.

It seems to me since Axiom IV as stated in Whittaker and Robinson's book 
does not exclude functions which are not everywhere differentiable that all my 
criticism is fair and just, and moreover nearly all my statements are correct. 
Mr. Wertheimer is entirely correct in pointing out that the words “everywhere” 
on page 181 of my paper are contradictory. As a matter of fact the whole 
paragraph beginning with line 7 on page 181 appears to me, on reexamining it, 
to be unsatisfactory. Except for this single paragraph I believe my paper to 
be rigorous, but I welcome further criticism. 

Mr. Wertheimer’s conclusions in his paragraph number 4 are clearly
erroneous. To show this, consider a function of $k$. As $k \to 0$ any one of three
situations may arise, namely: (1) The function may become infinite, (2) the
function may become indeterminate, that is it may take on any value whatever,
(3) the function may approach a unique finite value independent of $k$. Neither
Schimmack nor Whittaker and Robinson nor Mr. Wertheimer has established
as a definite fact that the particular type of function here in question approaches
a unique finite value independent of $k$ as $k \to 0$. The truth of the matter is that
this conclusion cannot be established because the function in question does not
involve $k$ either explicitly or implicitly.

In conclusion there are two things I wish to emphasize. First, even when
the word “everywhere” is added to Axiom IV, the proof given in Whittaker
and Robinson’s book is faulty, but if one consults the references given there
in the footnotes he will find two other proofs which are rigorous with this
addition to Axiom IV. Second, the mode of a skew bell shaped Pearson
Frequency Curve satisfies all four axioms as stated in Whittaker and Robinson’s
book, and the fact that these expressions for the mode are not differentiable
along a certain line is never a handicap to the statistician.


George Washington University. 



CORRELATION SURFACES OF TWO OR MORE INDICES WHEN THE 
COMPONENTS OF THE INDICES ARE NORMALLY DISTRIBUTED 


By George A. Baker

Indices are widely used in statistical analyses.¹ In many cases incorrect
conclusions are drawn because indices are not uncorrelated or independent even
though all of the component variables are independent. In a previous paper²
the distribution of an index both of whose components follow the normal law was
given exactly, i.e. without approximation. The purpose of the present paper is
to give the simultaneous distribution of two or more indices when each of the
components follows the normal law. The case for two indices will be discussed
in detail and the extension to more indices will be indicated.

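Let $x_1$, $x_2$, and $x_3$ be correlated variables each being normally distributed
about their respective means $m_1$, $m_2$, $m_3$, with standard deviations $\sigma_1$, $\sigma_2$, $\sigma_3$,
and let the correlations between the variables in pairs be represented by $r_{12}$,
$r_{13}$, $r_{23}$. Then the simultaneous distribution of these three variables will be

\[
\frac{1}{(2\pi)^{3/2} R^{1/2} \sigma_1 \sigma_2 \sigma_3}
\exp\left\{ -\frac{1}{2R} \left[
\frac{R_{11}(x_1 - m_1)^2}{\sigma_1^2}
+ \frac{R_{22}(x_2 - m_2)^2}{\sigma_2^2}
+ \frac{R_{33}(x_3 - m_3)^2}{\sigma_3^2}
+ 2R_{12}\frac{(x_1 - m_1)(x_2 - m_2)}{\sigma_1 \sigma_2}
+ 2R_{13}\frac{(x_1 - m_1)(x_3 - m_3)}{\sigma_1 \sigma_3}
+ 2R_{23}\frac{(x_2 - m_2)(x_3 - m_3)}{\sigma_2 \sigma_3}
\right] \right\} dx_1\, dx_2\, dx_3
\tag{1}
\]

where

\[
R = \begin{vmatrix} 1 & r_{12} & r_{13} \\ r_{12} & 1 & r_{23} \\ r_{13} & r_{23} & 1 \end{vmatrix}
\]

and $R_{ij}$ are the respective second order minors of $R$.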


1 Rietz, H. L., “On the Frequency Distribution of Certain Ratios,” Annals of
Mathematical Statistics, Vol. VII, No. 3, Sept. 1936, pp. 145-153.

2 Baker, G. A., “Distribution of the Means Divided by the Standard Deviations of
Samples From Non-homogeneous Populations,” Annals of Mathematical Statistics, Feb.
1932, pp. 3-5.
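If we make the transformation

\[
z_1 = \frac{x_1}{x_3}, \quad z_2 = \frac{x_2}{x_3}, \quad z_3 = x_3; \qquad
x_1 = z_1 z_3, \quad x_2 = z_2 z_3, \quad x_3 = z_3; \qquad
dx_1\, dx_2\, dx_3 = z_3^2\, dz_1\, dz_2\, dz_3,
\]

which is certainly valid if $x_1$, $x_2$, $x_3$ are all positive, then (1) becomes

\[
\frac{1}{(2\pi)^{3/2} R^{1/2} \sigma_1 \sigma_2 \sigma_3}
\exp\left\{ -\frac{1}{2R} \left[
\frac{R_{11}(z_1 z_3 - m_1)^2}{\sigma_1^2}
+ \frac{R_{22}(z_2 z_3 - m_2)^2}{\sigma_2^2}
+ \frac{R_{33}(z_3 - m_3)^2}{\sigma_3^2}
+ 2R_{12}\frac{(z_1 z_3 - m_1)(z_2 z_3 - m_2)}{\sigma_1 \sigma_2}
+ 2R_{13}\frac{(z_1 z_3 - m_1)(z_3 - m_3)}{\sigma_1 \sigma_3}
+ 2R_{23}\frac{(z_2 z_3 - m_2)(z_3 - m_3)}{\sigma_2 \sigma_3}
\right] \right\} z_3^2\, dz_1\, dz_2\, dz_3.
\tag{2}
\]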

If $x_1$, $x_2$, $x_3$ are all positive the corresponding distribution of $z_1$ and $z_2$ can be
obtained by integrating (2) between the limits 0 and $\infty$ with respect to $z_3$.
If $x_1$, $x_2$, $x_3$ are all negative $z_1$ and $z_2$ are again both positive so that in order to
get the total distribution for $z_1$ and $z_2$ it is necessary to add to the integral of (2)
between the limits 0 and $\infty$ with respect to $z_3$ the similar integral of (2) with $z_3$
replaced by $-z_3$. The result is

_1 c 1 »* 

2e 

(2r)‘fi 1 ' 


(3) 


2 R e 2 Ra 


<T l <T2 <?8 


f 

l _\/2 o* o* 


V* V» , . R' 

dz + 


2 * 6 a 

VlJ 


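where

\[
a = \frac{R_{11}}{\sigma_1^2} z_1^2 + \frac{R_{22}}{\sigma_2^2} z_2^2 + \frac{R_{33}}{\sigma_3^2}
+ \frac{2R_{12}}{\sigma_1 \sigma_2} z_1 z_2 + \frac{2R_{13}}{\sigma_1 \sigma_3} z_1 + \frac{2R_{23}}{\sigma_2 \sigma_3} z_2,
\]

\[
b = \frac{R_{11}}{\sigma_1^2} m_1 z_1 + \frac{R_{22}}{\sigma_2^2} m_2 z_2 + \frac{R_{33}}{\sigma_3^2} m_3
+ \frac{R_{12}}{\sigma_1 \sigma_2} z_1 m_2 + \frac{R_{12}}{\sigma_1 \sigma_2} m_1 z_2
+ \frac{R_{13}}{\sigma_1 \sigma_3} m_3 z_1 + \frac{R_{13}}{\sigma_1 \sigma_3} m_1
+ \frac{R_{23}}{\sigma_2 \sigma_3} m_3 z_2 + \frac{R_{23}}{\sigma_2 \sigma_3} m_2,
\]

\[
c = \frac{R_{11}}{\sigma_1^2} m_1^2 + \frac{R_{22}}{\sigma_2^2} m_2^2 + \frac{R_{33}}{\sigma_3^2} m_3^2
+ \frac{2R_{12}}{\sigma_1 \sigma_2} m_1 m_2 + \frac{2R_{13}}{\sigma_1 \sigma_3} m_1 m_3 + \frac{2R_{23}}{\sigma_2 \sigma_3} m_2 m_3.
\]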

The same result (3) is obtained for $z_1$ and $z_2$ negative, $z_1$ positive and $z_2$
negative, $z_1$ negative and $z_2$ positive. That is, (3) is the simultaneous distribution
of $z_1$ and $z_2$. The extension to more than 2 indices is immediate. The form of
the distribution of the indices and the denominator variable is the same as (2)
except that $a$, $b$, and $c$, the coefficients of $z_3^2$, $z_3$ and the constant term respectively
in the exponent of $e$, will be different in that they will include the new indices and
the exponent on the denominator variable will be the same as the number of
indices involved. The distribution of the indices will again be obtained by
integrating from 0 to $\infty$ with respect to the denominator variable.

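The case when all of the variables $x_1$, $x_2$, $x_3$ are independent is especially
interesting. If $r_{12}$, $r_{13}$, $r_{23}$ are all zero then $R = R_{11} = R_{22} = R_{33} = 1$,
$R_{12} = R_{13} = R_{23} = 0$ and $a$, $b$, $c$ become $a'$, $b'$, $c'$, respectively.

\[
a' = \frac{z_1^2}{\sigma_1^2} + \frac{z_2^2}{\sigma_2^2} + \frac{1}{\sigma_3^2}, \qquad
b' = \frac{m_1 z_1}{\sigma_1^2} + \frac{m_2 z_2}{\sigma_2^2} + \frac{m_3}{\sigma_3^2}, \qquad
c' = \frac{m_1^2}{\sigma_1^2} + \frac{m_2^2}{\sigma_2^2} + \frac{m_3^2}{\sigma_3^2}.
\]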

Under these conditions and the further condition that $m_1$, $m_2$, $m_3$ are large with
respect to $\sigma_1$, $\sigma_2$, $\sigma_3$ respectively, so that the integral term of (3) may be neglected,
(3) becomes


(4) 


( *»? m* 

-+-H 

•! i 


27TCT i <72 a$ 



1 + 


/ mi zi mtzt 

\ 2 ’ 2 

\ <fi crs 



( 4 + 4 + 4 ) 


It is clear that $z_1$ and $z_2$ are not independent in the probability sense for
distribution (4).
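
The dependence of two indices that share a denominator can also be seen by simulation. The following is a minimal sketch, not part of the derivation above: it draws three independent normal variables with hypothetical common mean 10 and standard deviation 1 (so that the denominator is effectively never near zero), forms $z_1 = x_1/x_3$ and $z_2 = x_2/x_3$, and computes their sample correlation.

```python
import random


def index_correlation(n=20000, mean=10.0, sd=1.0, seed=0):
    """Sample correlation of z1 = x1/x3 and z2 = x2/x3 for independent normal x's."""
    rng = random.Random(seed)
    z1, z2 = [], []
    for _ in range(n):
        x1 = rng.gauss(mean, sd)
        x2 = rng.gauss(mean, sd)
        x3 = rng.gauss(mean, sd)
        z1.append(x1 / x3)
        z2.append(x2 / x3)
    m1 = sum(z1) / n
    m2 = sum(z2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(z1, z2)) / n
    var1 = sum((a - m1) ** 2 for a in z1) / n
    var2 = sum((b - m2) ** 2 for b in z2) / n
    return cov / (var1 * var2) ** 0.5


r = index_correlation()
print(round(r, 3))
```

With equal means and equal standard deviations a first-order approximation puts the correlation near one half, although the three component variables are independent.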

The question as to the possibility of having the variables independent and the
indices independent at the same time arises. Denote the distribution functions
of $x_1$, $x_2$, $x_3$ by $X_1(x_1)$, $X_2(x_2)$, $X_3(x_3)$ and of $z_1$, $z_2$ by $Z_1(z_1)$, $Z_2(z_2)$. Then, if
$x_i > 0$, $i = 1, 2, 3$, it is necessary that

\[
\int_a^b X_1(z_3 z_1)\, X_2(z_3 z_2)\, X_3(z_3)\, z_3^2\, dz_3 = Z_1(z_1)\, Z_2(z_2),
\tag{5}
\]

$a$ and $b$ being suitable limits.

For instance, let 

X l (x i ) = 1, 1 < zi < 3 


X % (xJ - 1 < xi < 3 

x 2 

X t (xl) = x\, 1 < x, < 2 
then


Zt(zd - 

2l 

ZiW = -! 

for values of $z_1$ and $z_2$ within a straight line sided area the corners of which are
(i, i), (i, 1), (1,1) and (1, 2). $z_1$ and $z_2$ are not uncorrelated throughout their
entire set of values but are for this particular set of values. Thus it appears
that it is possible that the indices may be independent when the variables are,
but not necessarily so.

Indices should be used with care since it is very easy to draw invalid
conclusions from the consideration of them. Usually it is better to use partial
correlation analysis to remove the influence of a third factor than to calculate indices.



THE TYPE B GRAM-CHARLIER SERIES 


By Leo A. Aroian 

While much attention has been devoted to the Type A Gram-Charlier series
for the graduation of frequency curves, the Type B series has been somewhat
neglected. However the numerical examples to be presented later will show
that the Type B series is very useful for the graduation of skew frequency
curves. Wicksell¹ has demonstrated that the Gram-Charlier series may be
developed from the same law of probability which forms the basis of the Pearson
system of frequency curves. Rietz² following Wicksell gives a derivation of the
Gram-Charlier series based on the binomial $(q + p)^n$. Jordan³ gives a method
for fitting Type B based on certain orthogonal polynomials which he calls G.
He uses factorial moments because of the resulting ease in finding the values
of the constants.

We shall consider the Type B series for a distribution of equally distanced
ordinates at non-negative values of $x$. We shall find the values of the first few
terms of the series and shall also show how the values of later coefficients may
easily be found. We write the Type B series in the form

\[
F(x) = c_0\psi(x) + c_1\Delta\psi(x) + c_2\Delta^2\psi(x) + c_3\Delta^3\psi(x) + c_4\Delta^4\psi(x) + c_5\Delta^5\psi(x) + c_6\Delta^6\psi(x)
\tag{1}
\]

where

\[
\psi(x) = \frac{e^{-m} m^x}{x!}, \quad m = \mu'_1, \text{ the mean}, \qquad
\Delta\psi(x) = \psi(x) - \psi(x - 1) \text{ for } x = 0, 1, 2, \dots, s.
\tag{2}
\]

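Let $f(x)$ give the ordinates of the observed distribution of relative frequencies,
so that $\sum f(x) = 1$. To determine the coefficients $c_0, c_1, c_2, \dots, c_6$, we have,
using the method of moments,

\[
\begin{aligned}
\sum [c_0\psi(x) + c_1\Delta\psi(x) + c_2\Delta^2\psi(x) + c_3\Delta^3\psi(x) + \dots + c_6\Delta^6\psi(x)] &= \sum f(x) = 1, \\
\sum x\,[c_0\psi(x) + c_1\Delta\psi(x) + \dots + c_6\Delta^6\psi(x)] &= \sum x f(x) = m, \\
\sum x^2[c_0\psi(x) + c_1\Delta\psi(x) + \dots + c_6\Delta^6\psi(x)] &= \sum x^2 f(x) = \mu'_2, \\
\sum x^3[c_0\psi(x) + \dots + c_6\Delta^6\psi(x)] &= \sum x^3 f(x) = \mu'_3, \\
\sum x^4[c_0\psi(x) + \dots + c_6\Delta^6\psi(x)] &= \sum x^4 f(x) = \mu'_4, \\
\sum x^5[c_0\psi(x) + \dots + c_6\Delta^6\psi(x)] &= \sum x^5 f(x) = \mu'_5, \\
\sum x^6[c_0\psi(x) + \dots + c_6\Delta^6\psi(x)] &= \sum x^6 f(x) = \mu'_6.
\end{aligned}
\tag{3}
\]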
Hence we must find the values of

\[
\sum_{x=0}^{s} x^n \Delta^p \psi(x), \qquad n = 0, 1, 2, 3, \dots; \quad p = 0, 1, 2, 3, \dots,
\tag{4}
\]

defining $\Delta^0\psi(x) = \psi(x)$. We assume that we are dealing with distributions
in which $s$ is large, and that the error involved in substituting
$\sum_{x=0}^{\infty} x^n \Delta^p \psi(x)$ for $\sum_{x=0}^{s} x^n \Delta^p \psi(x)$ is negligible. To find these
summations in a straightforward manner would involve too much labor, so we
shall briefly discuss some properties of the generating function,
$\psi(x) = e^{-m} m^x / x!$, the Poisson exponential, very useful
in the graduation of frequency distributions of rare events. The first eight
moments about the origin are:

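\[
\begin{aligned}
\mu'_0 &= 1 = \sum \psi(x), \qquad \mu'_1 = m = \sum x\psi(x), \qquad \mu'_2 = m + m^2 = \sum x^2\psi(x), \\
\mu'_3 &= m + 3m^2 + m^3 = \sum x^3\psi(x), \\
\mu'_4 &= m + 7m^2 + 6m^3 + m^4 = \sum x^4\psi(x), \\
\mu'_5 &= m + 15m^2 + 25m^3 + 10m^4 + m^5 = \sum x^5\psi(x), \\
\mu'_6 &= m + 31m^2 + 90m^3 + 65m^4 + 15m^5 + m^6 = \sum x^6\psi(x), \\
\mu'_7 &= m + 63m^2 + 301m^3 + 350m^4 + 140m^5 + 21m^6 + m^7 = \sum x^7\psi(x), \\
\mu'_8 &= m + 127m^2 + 966m^3 + 1701m^4 + 1050m^5 + 266m^6 + 28m^7 + m^8 = \sum x^8\psi(x).
\end{aligned}
\tag{5}
\]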
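
The polynomials in (5) can be checked against a direct numerical summation. The sketch below is an illustration under stated assumptions: it relies on the standard fact that the raw moments of the Poisson exponential are $\mu'_n = \sum_k S(n, k)\, m^k$, where $S(n, k)$ are Stirling numbers of the second kind; the function names and the truncation point of 200 terms are arbitrary choices.

```python
import math


def poisson_raw_moment(n, m, terms=200):
    """Sum of x**n * psi(x) for x = 0..terms-1, psi(x) = exp(-m) m**x / x!."""
    total = 0.0
    psi = math.exp(-m)            # psi(0)
    for x in range(terms):
        total += (x ** n) * psi
        psi *= m / (x + 1)        # psi(x + 1) = psi(x) * m / (x + 1)
    return total


def stirling2(n, k):
    """Stirling number of the second kind via its standard recurrence."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def moment_polynomial(n, m):
    """Value of mu'_n at m, built from the Stirling coefficients."""
    return sum(stirling2(n, k) * m ** k for k in range(n + 1))


m = 2.0
for n in range(9):
    print(n, poisson_raw_moment(n, m), moment_polynomial(n, m))
```

The coefficient of $m^k$ in $\mu'_n$ is `stirling2(n, k)`, so individual entries of (5) can be read off by calling that function.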

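These may be found by the formula given by Jordan,³

\[
\mu'_{n+1} = m\left(\mu'_n + \frac{d\mu'_n}{dm}\right).
\tag{6}
\]

Proof:

\[
\frac{d\psi(x)}{dm} = \psi(x)\left(\frac{x}{m} - 1\right).
\]

We multiply by $x^n$ and sum, giving (6). This result may readily be proved also
by means of recursion formulas without differentiation. Now we must find the
values of

\[
\sum x^n \Delta^p \psi(x), \qquad n = 0, 1, 2, \dots; \quad p = 1, 2, 3, \dots.
\]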

We do this by proving 

22 x'A^tix) 

x—0 


-£-22x n A‘t(x). 

am it- o 


( 7 ) 
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Now 

(8) ^ = ^ ( x -1) - m = -wix). 

Hence 

+ (*)*(x - 2) + • • • + (-1 )V(* - •)], 

since A>(x) = ^(x) - ^®^(x - 1) + - 2) + • • • + (-1)V(* - *). 

Then by (8) 

£aV(x) - [*(x - 1) - m - (j)*fe - 2) + (®)*(x - 1) 

+ (%)*<* ~ 3) - (2)^^ - 2) + • • • + (-l)V(x - s - 1) 

- (-l)V(x - «)J. 

(9) ^jAV(») = -*(*) + (* | *)*(x - 1) - (* 2 - 2) + • • • 

— (—i)V(* — * — 1). 

= - [*(*) - (* t *) *(* - 1) + (* £ *) - 2) + • • • 

+ (-I)V(X - 3 - 1)J. 

= -A* + V(x). 

We multiply (9) by x n , sum with respect to x, giving (7). 

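Thus by use of (7) and (5) we get:

(10)
$\sum\Delta^p\psi(x) = 0, \qquad p = 1, 2, 3, \cdots$

$\sum x\Delta\psi(x) = -\dfrac{dm}{dm} = -1$.
$\sum x^2\Delta\psi(x) = -\dfrac{d}{dm}\sum x^2\psi(x) = -\dfrac{d}{dm}(m + m^2) = -2m - 1$.
$\sum x^3\Delta\psi(x) = -3m^2 - 6m - 1$.
$\sum x^4\Delta\psi(x) = -4m^3 - 18m^2 - 14m - 1$.
$\sum x^5\Delta\psi(x) = -5m^4 - 40m^3 - 75m^2 - 30m - 1$.
$\sum x^6\Delta\psi(x) = -6m^5 - 75m^4 - 260m^3 - 270m^2 - 62m - 1$.

$\sum x\Delta^2\psi(x) = 0, \quad \sum x^2\Delta^2\psi(x) = 2, \quad \sum x^3\Delta^2\psi(x) = 6m + 6$.
$\sum x^4\Delta^2\psi(x) = 12m^2 + 36m + 14$.
$\sum x^5\Delta^2\psi(x) = 20m^3 + 120m^2 + 150m + 30$.
$\sum x^6\Delta^2\psi(x) = 30m^4 + 300m^3 + 780m^2 + 540m + 62$.

$\sum x\Delta^3\psi(x) = 0, \quad \sum x^2\Delta^3\psi(x) = 0, \quad \sum x^3\Delta^3\psi(x) = -6$.
$\sum x^4\Delta^3\psi(x) = -24m - 36, \quad \sum x^5\Delta^3\psi(x) = -60m^2 - 240m - 150$.
$\sum x^6\Delta^3\psi(x) = -120m^3 - 900m^2 - 1560m - 540$.

$\sum x\Delta^4\psi(x) = 0, \quad \sum x^2\Delta^4\psi(x) = 0, \quad \sum x^3\Delta^4\psi(x) = 0, \quad \sum x^4\Delta^4\psi(x) = 24$.
$\sum x^5\Delta^4\psi(x) = 120m + 240$.
$\sum x^6\Delta^4\psi(x) = 360m^2 + 1800m + 1560$.

$\sum x\Delta^5\psi(x) = 0, \quad \sum x\Delta^6\psi(x) = 0$.
$\sum x^2\Delta^5\psi(x) = 0, \quad \sum x^2\Delta^6\psi(x) = 0$.
$\sum x^3\Delta^5\psi(x) = 0, \quad \sum x^3\Delta^6\psi(x) = 0$.
$\sum x^4\Delta^5\psi(x) = 0, \quad \sum x^4\Delta^6\psi(x) = 0$.
$\sum x^5\Delta^5\psi(x) = -120, \quad \sum x^5\Delta^6\psi(x) = 0$.
$\sum x^6\Delta^5\psi(x) = -720m - 1800, \quad \sum x^6\Delta^6\psi(x) = 720$.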
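The entries of (5) and (10) can be checked mechanically. The following is a minimal sketch, not part of the original paper: it builds the Poisson moments from Jordan's recursion (6), differentiates them p times as (7) prescribes, and compares the result with a direct summation of $x^n\Delta^p\psi(x)$. The function names are illustrative assumptions, and polynomials in m are stored as coefficient lists from the lowest power upward.

```python
import math


def poisson_raw_moments(n_max):
    """Coefficient lists (lowest power of m first) of mu'_0 .. mu'_n_max, built with (6)."""
    moments = [[1]]  # mu'_0 = 1
    for _ in range(n_max):
        prev = moments[-1]
        deriv = [k * c for k, c in enumerate(prev)][1:]  # d mu'_n / dm
        summed = [prev[i] + (deriv[i] if i < len(deriv) else 0) for i in range(len(prev))]
        moments.append([0] + summed)  # multiply (mu'_n + d mu'_n/dm) by m
    return moments


def differentiate(poly, times=1):
    """Differentiate a coefficient list with respect to m the given number of times."""
    for _ in range(times):
        poly = [k * c for k, c in enumerate(poly)][1:] or [0]
    return poly


def evaluate(poly, m):
    return sum(c * m**k for k, c in enumerate(poly))


def sum_formula(n, p, m):
    """Sum of x^n Delta^p psi(x) obtained from (7): (-1)^p d^p mu'_n / dm^p."""
    poly = differentiate(poisson_raw_moments(n)[n], p)
    return (-1) ** p * evaluate(poly, m)


def psi(x, m):
    """Poisson exponential, taken as zero for negative x."""
    if x < 0:
        return 0.0
    return math.exp(-m + x * math.log(m) - math.lgamma(x + 1))


def sum_direct(n, p, m, terms=150):
    """The same sum by brute force, with Delta^p expanded binomially."""
    def delta(x):
        return sum((-1) ** j * math.comb(p, j) * psi(x - j, m) for j in range(p + 1))
    return sum(x**n * delta(x) for x in range(terms))


# Example: the coefficient lists of mu'_6 and of the sum of x^6 Delta psi(x) at m = 1.
print(poisson_raw_moments(6)[6])
print(sum_formula(6, 1, 1.0), sum_direct(6, 1, 1.0))
```

The direct sum is truncated at 150 terms, which is an assumption that suits small m only; for larger means the upper limit would need to grow.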

Finally we substitute from (5) and (10) into (3), and for $\mu'_n$ we substitute
$\mu'_n = \sum_{r=0}^{n}\binom{n}{r}\mu_{n-r}m^r$. Hence

(11)
$c_0 = 1$

$c_1 = 0$

$c_2 = \dfrac{1}{2}(\mu_2 - m)$.

$c_3 = -\dfrac{1}{3!}(\mu_3 - 3\mu_2 + 2m)$.

$c_4 = \dfrac{1}{4!}[\mu_4 - 6\mu_3 + \mu_2(11 - 6m) + 3m(m - 2)]$.

$c_5 = -\dfrac{1}{5!}[\mu_5 - 10\mu_4 - \mu_3(10m - 35) + 50\mu_2(m - 1) - 4m(5m - 6)]$.

$c_6 = \dfrac{1}{6!}[\mu_6 - 15\mu_5 + \mu_4(85 - 15m) + \mu_3(130m - 225) + \mu_2(45m^2 - 375m + 274) - 15m^3 + 130m^2 - 120m]$.
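As a quick numerical illustration of (11), the sketch below evaluates $c_2$, $c_3$ and $c_4$ from the mean and the central moments. It is an illustrative helper rather than the author's procedure, and the function name is an assumption. The sample values are the moments printed with Table 1, whose tabulated coefficients are 1.74368, -.078298 and +.094592.

```python
def type_b_coefficients(m, mu2, mu3, mu4):
    """Coefficients c2, c3, c4 of the Type B series from (11).

    m is the mean; mu2, mu3, mu4 are moments about the mean.
    """
    c2 = (mu2 - m) / 2
    c3 = -(mu3 - 3 * mu2 + 2 * m) / 6
    c4 = (mu4 - 6 * mu3 + mu2 * (11 - 6 * m) + 3 * m * (m - 2)) / 24
    return c2, c3, c4


# Moments quoted with Table 1 of the paper.
c2, c3, c4 = type_b_coefficients(m=4.175, mu2=7.66237, mu3=15.1069, mu4=173.326)
print(round(c2, 5), round(c3, 6), round(c4, 6))
```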

It may be asked whether criteria may be given as guides for the use of Type B.
In general Type B may be tried if either the skewness of the distribution to
be fitted is considerable, $\alpha_3 = \mu_3/\mu_2^{3/2} > .6$, or if $m = \mu_2 = \mu_3$
approximately. The latter condition strictly would mean that $\psi(x)$ alone is
sufficient for a good graduation, if the fourth moment, $\mu_4$, is not used. The
examples which follow are arranged to facilitate comparison with the Pearson
system of frequency curves. We have an example each of Type I, III, IV, V, VI,
and an example of the normal curve.

Type I. Table 1. Here $\alpha_3 > .6$ although $m \neq \mu_2 \neq \mu_3$. The first four
moments, unadjusted, give an excellent fit by Type B, which is not quite as good
as Type I. The degrees of freedom, according to Fisher,⁴ have been taken into
consideration here in applying the $\chi^2$ test. The two classes 13, 14, were grouped
together for the $\chi^2$ test. The actual numerical work is easily done on a
calculating machine, although logarithms are necessary to find the value of $e^{-m}$.
This example and the remaining ones are all taken from Elderton⁵ with the
exception of Type IV, which is from A. Fisher.⁶

Type III. Table 2. The unadjusted moments are used. Here $\alpha_3 = 2.0833 > .6$,
and $m = \mu_2$ approximately. The fit by Type B is slightly better than that
by Type III. We have for Type III $P(\chi^2 > 12.8) = .007$, n = 3, while for Type
B, $P(\chi^2 > 9.4) = .025$, n = 3. Moreover the standard error of prediction for
Type III is 11.2 and for Type B is 7.7.

Type IV. Table 3. The rough moments were used. Although $\alpha_3 = .48 < .6$,
Type B gives a fine fit since $m = \mu_2 = \mu_3$ approximately. Here the results are
given for Type B using 2, 3, and 4 terms of the series. This was done to show
how the distribution changes with the addition of more terms. The superiority
of Type B over Type IV is evident. The results for Type IV are taken from the
class notes of Professor C. C. Craig.

Type V. Table 4. Using the adjusted moments we have a comparison among
Types V, A, and B. While the graduations may seem satisfactory, the $\chi^2$ test
shows that the fit is poor in each case. The order of merit is Type V, Type B,
and then Type A. The negative frequencies which appear in Type B may be
due to the use of the adjusted moments. If we use the rough moments, the
negative frequencies disappear. On the whole the fit by means of the adjusted
moments is superior.

Type VI. Table 5. Type VI using the adjusted moments gives an excellent
fit. Even though $\alpha_3$ is considerable, and $\mu_2 = \mu_3$ approximately, four moments
with Type B give a poor fit, and five moments, adjusted, achieve a very small
gain. Five moments using the unadjusted moments give some improvement,
but the -2 frequency in the first class is objectionable.

Normal Curve. Table 6. The normal curve provides a fine fit. $P(\chi^2 > .9) =
.96$, n = 6. The first two and the last two classes were grouped together for the
test. The fit by Type B is less probable, $P(\chi^2 > 8) = .15$, n = 5. Type B has
two discrepancies, the negative frequencies, and the fact that the total of the
frequencies (neglecting the -1) is 352. That Type B does so well is in itself
quite amazing!

                                TABLE 1

   x    Actual frequency    Frequency computed      Frequency given
                            by Pearson Type I       by Type B
   0          34                   44                   42.4
   1         145                  137                  121.3
   2         156                  149                  168.7
   3         145                  142                  156.8
   4         123                  127                  120.5
   5         103                  108                   94.9
   6          86                   88                   82.9
   7          71                   69                   72.2
   8          55                   51                   56.7
   9          37                   36                   38.0
  10          21                   24                   23.1
  11          13                   14                   12.0
  12           7                    7                    5.7
  13           3                    3                    2.4
  14           1                    1                     .9

  m = 4.175            $\alpha_3$ = .712247
  $\mu_2$ = 7.66237    $\alpha_4$ = 2.95214
  $\mu_3$ = 15.1069    $c_2$ = 1.74368
  $\mu_4$ = 173.326    $c_3$ = -.078298
                       $c_4$ = +.094592

  Type I   $P(\chi^2 > 4.36) = .88$,  n (number of degrees of freedom) = 9
  Type B   $P(\chi^2 > 9.67) = .37$,  n = 9

  $F(x) = \psi(x) + 1.74368\,\Delta^2\psi(x) - .078298\,\Delta^3\psi(x) + .094592\,\Delta^4\psi(x)$.
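The Type B column of Table 1 can be regenerated from the fitted series. The sketch below is an illustration under stated assumptions, not the author's worksheet: it takes $\psi(x)$ as zero for negative x, expands each $\Delta^p\psi(x)$ binomially, and scales by a total frequency of 1000, which is the sum of the observed column. The function names are hypothetical.

```python
import math


def psi(x, m):
    """Poisson exponential e^(-m) m^x / x!, taken as zero for negative x."""
    if x < 0:
        return 0.0
    return math.exp(-m + x * math.log(m) - math.lgamma(x + 1))


def delta(p, x, m):
    """p-th backward difference of psi at x, by binomial expansion."""
    return sum((-1) ** j * math.comb(p, j) * psi(x - j, m) for j in range(p + 1))


def type_b_frequencies(m, coeffs, total, x_max):
    """Graduated frequencies total * F(x) for x = 0 .. x_max.

    coeffs maps the order p of the difference to its coefficient c_p.
    """
    return [
        total * sum(c * delta(p, x, m) for p, c in coeffs.items())
        for x in range(x_max + 1)
    ]


# Series quoted beneath Table 1; the total of 1000 is the sum of the observed column.
coeffs = {0: 1.0, 2: 1.74368, 3: -0.078298, 4: 0.094592}
freqs = type_b_frequencies(m=4.175, coeffs=coeffs, total=1000, x_max=14)
for x, f in enumerate(freqs):
    print(x, round(f, 1))
```

Run on these inputs, the first three classes come out close to the tabulated 42.4, 121.3 and 168.7.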


                                TABLE 2

   x    Actual frequency    Frequency computed      Frequency by
                            by Type III             Type B
   0          44                   59                   48.1
   1         135                  111                  121.6
   2          45                   45                   58.5
   3          12                   20                   10.4
   4           8                    9                    3.5
   5           3                    4                    4.3
   6           1                    2                    2.9
   7           3                    1                    1.2

  m = 1.33466          $\alpha_3 = \mu_3/\mu_2^{3/2}$ = 2.0833     $c_2$ = .05356
  $\mu_2$ = 1.44179                                                $c_3$ = -.32510
  $\mu_3$ = 3.60662

  $F(x) = \psi(x) + .05356\,\Delta^2\psi(x) - .32510\,\Delta^3\psi(x)$

                                TABLE 3

  Number of alpha particles from a bar of polonium in intervals of 1/8 of one minute

   x    Frequency    Type IV    Type B     Type B     Type B
                                2 terms    3 terms    4 terms
   0        57          50        49.5       49.0       58.2
   1       203         183       201.3                 199.8
   2       383         392                              386.1
   3       525         544       532.3      533.8      523.9
   4       532         539                  521.5      532.1
   5       408         417                              418.2
   6       273         250       254.8      254.4      260.2
   7       139         131       137.1      136.7      134.0
   8        45          61                   63.9       56.7
   9        27          26        26.1       26.2       22.9
  10        10          12         9.4        9.6        8.6
  11         4           4                    3.1        3.6
  12         0           1          .9         .9        1.6
  13         1           0          .2         .2         .8
  14         1           0                     .0         .3

  m = 3.87155          $\alpha_3$ = .47844
  $\mu_2$ = 3.69477    $\alpha_4$ = 3.506536
  $\mu_3$ = 3.39791
  $\mu_4$ = 47.86888

  $F(x) = \psi(x) - .08839\,\Delta^2\psi(x) - .00930\,\Delta^3\psi(x) + .16810\,\Delta^4\psi(x)$.

  Type B, 4 terms   $P(\chi^2 > 4.50) = .72$,  n = 7
  Type IV           $P(\chi^2 > 10.8) = .15$,  n = 7

                                TABLE 4

  Mortality Among Female Nominees

   x   Deaths   Elderton   Type A   Type B    Type B    Type B    Type B
                Type V              2 terms   3 terms   5 terms   5 terms
   0      4        4          2       1.4      -6.9      -.4       4.1
   1     18       10         15      26.3       7.1      9.4      13.1
   2     53       80         78     109.7     100.1     84.6      77.4
   3    265      261        235     248.3     268.4    252.3     242.5
   4    438      441        426     379.5     418.8    425.9     427.4
   5    525      480        521     432.7     461.0    484.0     494.1
   6    342      381        411     388.8     388.4    402.6     408.1
   7    253      247        225     285.4     263.5    259.0     253.9
   8    128      137        107     170.8     145.5    132.2     124.9
   9     82       68         66      84.3      68.3     58.6      54.1
  10     28       32         44      32.9      28.2     26.2      26.4
  11     12       14         22       8.6      11.0     13.9      16.4
  12      8        6          8      -.01       4.7      8.2      10.7
  13      5        3          2      -2.1       2.1      4.3       5.9
  14      1        1          0      -1.5       1.3      2.0       2.5

  Adjusted moments:
  m = 5.30435            $\alpha_3$ = .703564
  $\mu_2$ = 3.573345     $\alpha_4$ = 3.996196
  $\mu_3$ = +4.752437
  $\mu_4$ = 51.02659
  $\mu_5$ = 193.439125

  Rough moments:
  m = 5.30435
  $\nu_2$ = 3.65668
  $\nu_3$ = 4.752437
  $\nu_4$ = 52.85276
  $\nu_5$ = 197.39949

  Type A: $f(t) = \varphi(t) + .117261\,\varphi^{(3)}(t) + .041508\,\varphi^{(4)}(t)$
  Type B: $F(x) = \psi(x) - .86550\,\Delta^2\psi(x) - .77352\,\Delta^3\psi(x) + .02814\,\Delta^4\psi(x) + .57459\,\Delta^5\psi(x)$

  Using uncorrected moments

  Type B: $F(x) = \psi(x) - .82384\,\Delta^2\psi(x) - .73185\,\Delta^3\psi(x) + .03192\,\Delta^4\psi(x) + .94033\,\Delta^5\psi(x)$

  (last column above)

                                TABLE 5

   x    Frequency    Type VI    Type B     Type B
                                4 terms    5 terms
   0        1           1        -9.5       -2.0
   1       56          50        83.2       69.9
   2      167         168       141.6      143.1
   3       98         100       102.3      110.7
   4       34          36        41.5       40.2
   5        9          10         8.7        4.6
   6        2           2         .05        2.0
   7        1          .5        -.4         1.0

  Corrected moments:
  m = 2.402174
  $\mu_2$ = .928835
  $\mu_3$ = .893096
  $\mu_4$ = 4.088800

  Rough moments:
  m = 2.402174
  $\nu_2$ = 1.012169
  $\nu_3$ = .893096
  $\nu_4$ = 4.313176
  $\nu_5$ = 11.28304
  $\alpha_3$ = .87704
  $\alpha_4$ = 4.2101

  Type B, adjusted moments:

  $F(x) = \psi(x) - .73667\,\Delta^2\psi(x) - .48516\,\Delta^3\psi(x) - .06424\,\Delta^4\psi(x) + .10365\,\Delta^5\psi(x)$

  *Type B, rough moments:

  $F(x) = \psi(x) - .69805\,\Delta^2\psi(x) - .44654\,\Delta^3\psi(x) - .06587\,\Delta^4\psi(x) + .15165\,\Delta^5\psi(x)$

  * This is used in the last column of the table above. There is a slight error
  here, which however will not affect the results materially. The third decimal
  place may be slightly wrong.

                                TABLE 6

  Normal curve

   x    Frequency    Normal curve    Type B
   0       .6            .6            2.3
   1      2.8           2.7            4.7
   2     11.5          10.9            8.7
   3     27.7          30.1           25.2
   4     59.1          58.4           55.2
   5     84.7          80.1           79.5
   6     74.1          76.9           80.1
   7     50.5          52.2           58.1
   8     23.2          25.0           29.7
   9     12.2           8.4            8.6
  10      1.3           2.4            -.9

  Moments corrected:
  m = 5.393443
  $\mu_2$ = 2.769635
  $\mu_3$ = .029805,   $\mu_4$ = 22.40663
  $\alpha_3$ = .0064
  $\alpha_4$ = 2.920997

  Type B: $F(x) = \psi(x) - 1.3119\,\Delta^2\psi(x) - .4179\,\Delta^3\psi(x) + 2.1625\,\Delta^4\psi(x)$

Colorado State College

A TEST OF A SAMPLE VARIANCE BASED ON BOTH TAIL ENDS OF THE DISTRIBUTION

By John W. Fertig

With the assistance of Elizabeth A. Proehl¹

(1) Introduction

In testing the hypothesis, say $H_0$, that an observed sample E of size N has
been drawn from a normal population for which the standard deviation, $\sigma$, has a
particular value, $\sigma_0$, one may form the ratio

$v = \sum_{i=1}^{N}(x_i - m)^2/\sigma_0^2$  . . . . (I)

if the population mean m be known, or

$v' = \sum_{i=1}^{N}(x_i - \bar{x})^2/\sigma_0^2$  . . . . (II)

where $\bar{x}$ is the sample mean, if the population mean be unknown. The
probability of obtaining a larger (or smaller) value of v or v' than that observed may
readily be obtained from the appropriate tail area of the $\chi^2$ distribution with
n = N or n = (N - 1) degrees of freedom respectively. The alternative
hypotheses to $H_0$ concerning the normal populations from which the sample
may have been drawn assign different values to $\sigma$ and form a set of hypotheses,
$\Omega$. The members of $\Omega$ may be classed according to whether they specify
$\sigma > \sigma_0$, or $\sigma < \sigma_0$. The practice of regarding only one tail of the distribution,
the upper or lower depending on whether v > N or v < N, is tantamount to
accepting as admissible alternatives to $H_0$ only one of the classes of $\Omega$.

The alternatives may sometimes be limited to one class or the other through
some a priori knowledge, or the problem may be such that only one of the classes
is relevant. However, since this is not generally the case, some method of
considering all of the alternatives is needed. When testing hypotheses
concerning the mean of the sampled population, the problem is quite simple, since
the distribution of means is symmetrical. Thus, the "corresponding" value to
any positive deviation, $(\bar{x} - m)$, is the negative deviation of the same magnitude.
Merely doubling the tail area pertaining to either of the deviations will serve to
take account of both classes of alternatives, i.e., those in which $m > m_0$ and
those in which $m < m_0$. The problem is more difficult in the case of v or v',
since the distribution is not symmetrical. In addition to the value of v or v'
pertaining to the observed sample we require a "corresponding" value at the
other end of the distribution. The definition of "corresponding" which is
accepted will determine the required value. There may be a number of such
definitions but not all of these will be equally acceptable. The value of v
which delimits an equal tail area specifies one of the possible definitions of
"corresponding." Another definition would require that the ordinates at the
two values of v be equal.

¹ From the Memorial Foundation for Neuro-Endocrine Research and the Research
Service of the Worcester State Hospital, Worcester, Massachusetts.

The Neyman and Pearson Approach. Generalized procedures for testing
statistical hypotheses have been elaborated in recent years by J. Neyman and
E. S. Pearson (1-5). These have considerable philosophical appeal and will be
traced as a basis of solution of the immediate problem. A test of a hypothesis
$H_0$ consists essentially of a rule for rejecting $H_0$ when the observed sample E
falls within a suitable critical region w of the N-dimensioned sample space W,
and of accepting $H_0$ when E falls in (W - w). In testing any hypothesis two
types of error may be made:

i) $H_0$ may be rejected when it is true;

ii) $H_0$ may be accepted when some alternative hypothesis, $H_1$, is true.

Errors of the first kind may be considered "equivalent" since, if a true
hypothesis is to be rejected, it is immaterial which one is chosen. Furthermore, the
first type of error can be controlled through our choice of the size of w, say $\alpha$.
The size of w represents the probability of a sample E being an element of w
when the hypothesis $H_0$ is true. This probability may be designated briefly as
$P\{E \in w \mid H_0\}$. Then

$P\{E \in w \mid H_0\} = \int\cdots\int_w p(E \mid H_0)\,dx_1\,dx_2\cdots dx_N = \alpha$  . . . (III)

where $p(E \mid H_0)$ is the elementary probability law of the sample when $H_0$ is
true, i.e.,

$p(E \mid H_0) = p(x_1, x_2, \cdots, x_N \mid H_0)$  . . . . (IV)

Errors of the second type, however, are not equivalent, since their consequences
depend on the difference of the true hypothesis from $H_0$. The utility of a test
of $H_0$ will depend largely on how it controls the second type of error. Ideally,
the selection of a critical region should take into consideration the probabilities
a priori of the hypotheses composing $\Omega$. Since these probabilities are generally
unknown, tests may be sought which are valid independently of them.

A distinction must be made between simple hypotheses which specify
completely the elementary probability law of the sample, p(E), and composite
hypotheses which specify the law subject to one or more undetermined parameters.

(2) Simple Hypothesis Concerning Population Variance 

A test based on a critical region $w_0$ may be called independent of the
probabilities a priori of the alternative hypotheses if it is more powerful than any other
equivalent test for all of the alternative hypotheses (3). An equivalent test
is one based on a region $w_1$ of the same size, $\alpha$, i.e.,

$P\{E \in w_0 \mid H_0\} = P\{E \in w_1 \mid H_0\} = \alpha$  . . . . (V)

The power of a test based on any critical region, as $w_1$, is the probability of its
rejecting a hypothesis $H_0$ when some other hypothesis $H_1$ is true. That is,
it is the probability of E falling in $w_1$ when $H_1$ is true. Denote this power by
$P\{E \in w_1 \mid H_1\}$. The greater the power of a test, the smaller the risk of the
second type of error. If tests as defined above exist, they minimize the
probability of the second type of error. Furthermore, the probability of the first
type of error is no larger than $\alpha$. Neyman and Pearson (2) have designated
regions satisfying this definition as Best Critical Regions for testing $H_0$ with
regard to the set $\Omega$. If there is no such Best Critical Region, some compromise
region must be chosen.

A necessary and sufficient condition for $w_0$ to be a Best Critical Region with
regard to an alternative $H_1$ is that within $w_0$

$p(E \mid H_0) < k\,p(E \mid H_1)$  . . . . (VI)

where k is some constant depending on $\alpha$. If this inequality is true for any $H_1$,
$w_0$ will be a Best Critical Region for the set $\Omega$.

Neyman and Pearson (2) have shown that in testing the hypothesis that
$\sigma = \sigma_0$, when the population mean m is known, there are two Best Critical
Regions, one pertaining to the class of alternatives for which $\sigma < \sigma_0$ and defined
by $v < v_1$, the other to the class $\sigma > \sigma_0$ defined by $v > v_2$. $v_1$ and $v_2$ are values
of v so chosen that the size of the critical region shall be $\alpha$. Although there is
no Best Critical Region for all of the alternatives, the choice of a compromise
critical region should still depend on its control of the second source of error,
that is, on its power for the various alternatives (4). Such a compromise
region may be designated as a Good Critical Region. What is needed is a
region $w_0$ of size $\alpha$ defined by the inequalities $v < v_1$ and $v > v_2$. If $v_1$ and $v_2$
are taken as the values cutting off equal tail areas, then the power of the test
will be less than $\alpha$ for some values of $\sigma$ less than $\sigma_0$. For those values of $\sigma$, $H_0$
would be accepted more frequently than if it were true. Thus a first
requirement for a Good Critical Region is that its power should nowhere be less than $\alpha$,
the value when $H_0$ is true. Of all such unbiassed Critical Regions of size $\alpha$,
$w_0$ should then be selected so that its power is everywhere greater than that of
any other equivalent unbiassed region.

Critical Regions sufficiently satisfying the above requirements can often be
obtained by stipulating that the first derivative of the power function with
respect to $\theta$, the parameter under consideration, shall be zero at $\theta = \theta_0$, and
that the second shall be a maximum there. Then not only does the probability
of the second source of error decrease as we move away from $\theta_0$, but it decreases
most rapidly in the vicinity of $\theta_0$. Critical Regions satisfying these conditions
are called unbiassed Critical Regions of Type A (4). Under certain assumptions
concerning the nature of the elementary probability law $p(E \mid \theta)$ it can be shown
that $w_0$ is defined by the inequalities $\varphi_1 < c_1$ and $\varphi_1 > c_2$ where $c_1$ and $c_2$ satisfy
the conditions

$\int_{c_1}^{c_2} p(\varphi_1)\,d\varphi_1 = 1 - \alpha$  . . . . (VII)

$\int_{c_1}^{c_2} \varphi_1\,p(\varphi_1)\,d\varphi_1 = 0$  . . . . (VIII)

where

$\varphi_1 = \left.\dfrac{\partial \log p(E \mid \theta)}{\partial\theta}\right|_{\theta=\theta_0}$  . . . . (IX)

and $p(\varphi_1)$ is the distribution function of $\varphi_1$.

In applying these results to the testing of the hypothesis that $\sigma = \sigma_0$,² when
the population mean is known,

$\varphi_1 = (v - N)/\sigma_0$  . . . . (X)

Obviously p(v), the distribution of v, may be considered instead of $p(\varphi_1)$. $w_0$ is
defined by the inequalities $v < v_1$ and $v > v_2$ where

$\int_0^{v_1} p(v)\,dv + \int_{v_2}^{\infty} p(v)\,dv = \alpha_1 + \alpha_2 = \alpha$  . . . . (XI)

$\int_{v_1}^{v_2}(v - N)\,p(v)\,dv \propto \Big[v^{N/2}e^{-v/2}\Big]_{v_1}^{v_2} = 0$  . . . . (XII)

$w_0$ so defined is also of type $A_1$, that is, its power curve lies everywhere
above that of any other equivalent region, vanishing in the first derivative at
$\sigma = \sigma_0$ (4).

The use of $w_0$ as the appropriate critical region is equivalent to the use of r
as a test criterion, where

$v^{N/2}e^{-v/2} = r$  . . . . (XIII)

That is, a value of v yielding the same r as the observed v may be taken as the
corresponding value. Reference to the appropriate tables and summing of the
two tail areas gives $P_r$, the probability of obtaining a smaller value of r when
$H_0$ is true. $H_0$ may be rejected if $P_r$ is less than some previously fixed number,
say $\alpha$. If the distribution of r could be evaluated the necessity of dealing with
two values of v would be obviated.

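The criterion r is equivalent to that deduced by the use of maximum likelihood
ratios (6). Thus,

$p(E \mid \sigma^2) = (2\pi\sigma^2)^{-N/2}\,e^{-\sum_{i=1}^{N}(x_i - m)^2/2\sigma^2}$  . . . . (XIV)

Maximizing $p(E \mid \sigma^2)$ for fixed E and all possible $\sigma^2$ we have

$p_{\max}(E \mid \sigma^2) = \left[\dfrac{2\pi}{N}\sum_{i=1}^{N}(x_i - m)^2\right]^{-N/2} e^{-N/2}$  . . . . (XV)

$\lambda = \dfrac{p(E \mid \sigma_0^2)}{p_{\max}(E \mid \sigma^2)} = N^{-N/2}\,v^{N/2}\,e^{-\frac{1}{2}(v - N)}$  . . . . (XVI)

$\quad = N^{-N/2}\,e^{N/2}\,r$.  . . . . (XVII)

The h-th moment coefficient of $\lambda$ about zero, $\mu'_h(\lambda)$, is given by

$\mu'_h(\lambda) = \dfrac{\Gamma\left[\frac{N}{2}(1 + h)\right]}{\Gamma(N/2)}\left(\dfrac{2e}{N}\right)^{Nh/2}(1 + h)^{-N(1+h)/2}$  . . . . (XVIII)

² The solution is the same in terms of $\sigma^2$.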

TABLE I

Probability that a sample has been drawn from a normal population with a specified variance or standard deviation

For N infinite, $(-2\log_e\lambda)$ will be distributed as $\chi^2$ with one degree of freedom.
For finite values of N, however, we have not been able to evaluate the
distribution of $\lambda$, although the distribution of the Incomplete Beta Function serves
as a good approximation. Approximate distributions for several values of N
have been obtained. $P_\lambda$, the probability of obtaining a smaller value of $\lambda$
than that observed, as obtained from these distributions agrees well with the
sum of the tail areas pertaining to $v_1$ and $v_2$ yielding the same value of $\lambda$ (or r).
The construction of tables is simplified by taking (1)

$\log_{10}\lambda = \dfrac{N}{2}(\log_{10}e - k)$  . . . . (XIX)

That is,

$x - \log_e x = k\log_e 10$  . . . . (XX)

where x = v/N. Equation (XX) is independent of N and may be solved once
and for all for x, given k.³ In Figure 1 is plotted the graph of equation (XX).
For convenience, the branch of the curve giving the roots greater than unity
has been folded back with altered scale from the minimum value of k, $\log_{10}e$,
occurring at x = 1. Table I was then constructed by multiplying the two
values of x for a given k by $(N/2)^{1/2}$, referring to the Tables of the Incomplete
Gamma Function (7) with p = (N - 2)/2, and adding the resulting two tail
areas. The values for the odd numbers above 12 were obtained by interpolating
between the even numbers. For N = 1, $(x)^{1/2}$ was used as a normal deviate.
The values in Table I should be correct to four decimals. Table I is entered
with the number of degrees of freedom, n, on which $\chi^2$ is based. In the case of the
simple hypothesis this is N.

The following may serve as an illustration: Blood urea nitrogen determinations
(mg./100 cc.) were made on a sample of 25 schizophrenic patients. The mean
was found to be 15.56, the variance, 10.486. Previous investigation of blood
urea nitrogen on a large sample of normal control subjects gave a mean of 16.03
and a variance of 20.268, which for the purpose of the example may be considered
as the population parameters. Then we may wish to test the hypothesis that
the variance of the sampled population, $\sigma^2$, is $\sigma_0^2 = 20.268$, knowing the mean
of the sampled population to be 16.03. Calculate

$x = \dfrac{s^2 + (\bar{x} - m)^2}{\sigma_0^2} = .528$

Referring to Fig. 1, the value of k is about .505. Turning to Table I with
k = .505,⁴ n = 25, P is found to be .0457. We should thus be inclined to reject
the hypothesis.
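The two-tail procedure just illustrated can be sketched in code. This is a hedged reconstruction rather than the author's tabulation method: it solves equation (XX) for the second root by bisection instead of reading Figure 1, and it obtains the $\chi^2$ tail areas from a series for the incomplete gamma function instead of from Pearson's tables. All function names are assumptions introduced for the illustration.

```python
import math


def chi2_cdf(v, n, terms=500):
    """P(chi-square with n degrees of freedom <= v), by the series
    P(a, x) = e^(-x) x^a * sum_k x^k / Gamma(a + k + 1), with a = n/2 and x = v/2."""
    a, x = n / 2.0, v / 2.0
    if x <= 0:
        return 0.0
    log_pref = a * math.log(x) - x
    total = sum(
        math.exp(log_pref + k * math.log(x) - math.lgamma(a + k + 1))
        for k in range(terms)
    )
    return min(total, 1.0)


def k_from_x(x):
    """k of equation (XX): x - ln x = k ln 10."""
    return (x - math.log(x)) / math.log(10)


def corresponding_x(x, iters=200):
    """The other root of x - ln x = const, i.e. the value giving the same k (and r)."""
    if x == 1:
        return 1.0
    target = x - math.log(x)
    f = lambda t: t - math.log(t) - target
    if x < 1:  # the other root lies above 1
        lo, hi = 1.0, 2.0
        while f(hi) < 0:
            hi *= 2
    else:      # the other root lies between 0 and 1
        lo, hi = math.exp(-target), 1.0
    f_lo_positive = f(lo) > 0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if (f(mid) > 0) == f_lo_positive:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def two_tail_probability(x, n):
    """Sum of the two tail areas for v = n * x and its corresponding value."""
    x1, x2 = sorted((x, corresponding_x(x)))
    return chi2_cdf(n * x1, n) + (1.0 - chi2_cdf(n * x2, n))


# Blood urea nitrogen example: x = .528 with n = 25 degrees of freedom.
print(round(k_from_x(0.528), 3), round(two_tail_probability(0.528, 25), 4))
```

For the composite case treated later in the paper the same function would be called with x = s²/σ₀² and n = N - 1.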

³ If the solution were explicit the distribution of $\lambda$ could easily be deduced from that of x.
⁴ k obtained directly from (XX) is .507, corresponding to P = .0427.

For N small, the area of the tail of the distribution near zero is considerably
larger than that at the upper end. As N increases the distribution of v becomes
more and more symmetrical and the two areas approach equality. Even for
N = 50, however, they are rather unequal, so that merely doubling the area
pertaining to the observed v does not give a sufficiently accurate approximation.
For N > 50 an approximation correct within several units in the third decimal
place may be obtained by taking $\sqrt{2N}(\sqrt{x} - 1)$ as a normal deviate. This
assumes that the standard deviation is normally distributed with variance $\sigma^2/2N$.
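The large-sample approximation can be written in a few lines. The sketch below assumes that the probability wanted is the two-sided normal tail area for the deviate $\sqrt{2N}(\sqrt{x} - 1)$; the function names are illustrative.

```python
import math


def normal_deviate(x, n):
    """sqrt(2N) (sqrt(x) - 1), where x = v / N."""
    return math.sqrt(2 * n) * (math.sqrt(x) - 1)


def two_sided_p(x, n):
    """Two-sided normal tail area 2 * (1 - Phi(|z|)), written with erfc."""
    z = abs(normal_deviate(x, n))
    return math.erfc(z / math.sqrt(2))


# Hypothetical illustration: N = 200 and x = 1.21 give a deviate of 2.
print(normal_deviate(1.21, 200), round(two_sided_p(1.21, 200), 4))
```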

(3) Composite Hypothesis Concerning Population Variance 

Here $H_0$ specifies only the value of the parameter $\theta = \theta_0$, leaving undetermined
the value of a second parameter, $\nu$. Thus, $H_0$ consists of a subset, $\omega$, of simple
hypotheses, each of which specifies a different value for $\nu$. Any simple
hypothesis specifying different values of both parameters, $\theta$ and $\nu$, is an alternative
to $H_0$. These alternatives form the set $\Omega$. The elementary probability law
determined by $H_0$ is $p(E \mid H_0) = p(E \mid \theta_0\nu)$, while that determined by an
alternative hypothesis $H_1$ is $p(E \mid H_1) = p(E \mid \theta_1\nu_1)$. In testing composite hypotheses
the first requirement is to find regions "similar" to W with regard to $\nu$, i.e., such
that the chance of rejection of a true hypothesis, $P\{E \in w \mid H_0\}$, equals $\alpha$ for all
the values of $\nu$ specified by the simple hypotheses composing $H_0$. A test based
on a similar region $w_0$ may be called independent of the probabilities a priori,
if its power with respect to all the alternatives of $\Omega$ is greater than that of any
other similar region $w_1$ of the same size, $\alpha$ (3). Let

$\varphi_2 = \partial\log p(E \mid \theta_0\nu)/\partial\nu$  . . . . (XXI)

Then the equations $\varphi_2$ = constant will describe hypersurfaces in N-dimensioned
space, on one of which the observed E must fall. Under certain assumptions
pertaining to the law of elementary probability it can be shown (2) that a
necessary and sufficient condition for w to be a similar region is that

$P\{E \in w(\varphi_2) \mid H_0\} = \alpha\,P\{E \in W(\varphi_2) \mid H_0\}$  . . . . (XXII)

for all values of $\varphi_2$, where $w(\varphi_2)$ and $W(\varphi_2)$ are parts of the surface $\varphi_2$ = constant
common to w and W respectively. A similar region is then built up of these
parts $w(\varphi_2)$ obtaining for the various values of $\varphi_2$. The Best Critical Region,
$w_0$, for a particular simple alternative, $H_1$, must then be composed of pieces,
$w_0(\varphi_2)$, maximizing $P\{E \in w_0(\varphi_2) \mid H_1\}$. The problem is the same as for simple
hypotheses except that we shall be working in a space $W(\varphi_2)$ of (N - 1)
dimensions. $w_0(\varphi_2)$ is defined by the inequality

$p(E \mid H_1) > k(\varphi_2)\,p(E \mid H_0)$  . . . . (XXIII)

where $k(\varphi_2)$ is some constant depending on $\alpha$. If $w_0(\varphi_2)$ is the same for all $H_1$,
then $w_0$ is the Best Critical Region for testing $H_0$ with respect to $\Omega$.

Neyman and Pearson showed (2) that in testing the composite hypothesis that
$\sigma = \sigma_0$ when the population mean is unknown there are two Best Critical Regions
corresponding to the class of alternatives $\sigma < \sigma_0$ and $\sigma > \sigma_0$, defined respectively
by the inequalities $v' < v'_1$ and $v' > v'_2$. If the whole set of alternatives, $\Omega$, is to
be considered some compromise region must be sought. Dealing with the case
where similar regions exist Neyman (5) defines a Critical Region as unbiassed
and of Type B if the first derivative of the power function, $P\{E \in w \mid H_1\}$, with
respect to $\theta$ vanishes at $\theta = \theta_0$, and if the second derivative at that point is a
maximum. Let

$\varphi_1 = \left.\dfrac{\partial\log p(E \mid \theta\nu)}{\partial\theta}\right|_{\theta=\theta_0}$  . . . . (XXIV)

Then it can be shown that the desired region will be defined by the inequalities
$\varphi_1 < k_1(\varphi_2)$ and $\varphi_1 > k_2(\varphi_2)$ where $k_1(\varphi_2)$ and $k_2(\varphi_2)$ are determined to satisfy

$\int_{k_1(\varphi_2)}^{k_2(\varphi_2)} p(\varphi_1\varphi_2)\,d\varphi_1 = (1 - \alpha)\,p(\varphi_2)$  . . . . (XXV)

and

$\int_{k_1(\varphi_2)}^{k_2(\varphi_2)} \varphi_1\,p(\varphi_1\varphi_2)\,d\varphi_1 = (1 - \alpha)\int_{-\infty}^{\infty}\varphi_1\,p(\varphi_1\varphi_2)\,d\varphi_1$  . . . . (XXVI)

where $p(\varphi_2)$ is the distribution function of $\varphi_2$, and $p(\varphi_1\varphi_2)$ is the simultaneous
distribution of $\varphi_1$ and $\varphi_2$.

Applying equations (XXV) and (XXVI) it follows that the appropriate
Critical Region is defined by the inequalities $v' < v'_1$ and $v' > v'_2$ where

$\int_0^{v'_1} p(v')\,dv' + \int_{v'_2}^{\infty} p(v')\,dv' = \alpha_1 + \alpha_2 = \alpha$  . . . . (XXVII)

and

$\Big[v'^{(N-1)/2}e^{-v'/2}\Big]_{v'_1}^{v'_2} = 0$  . . . . (XXVIII)

where p(v') is the distribution function of v'.

The use of the unbiassed Critical Region of Type B corresponds to adopting
as a criterion

$v'^{(N-1)/2}e^{-v'/2} = r'$  . . . . (XXIX)

Since v' derived from a sample of size N is distributed as v derived from a sample
of size (N - 1), it follows that r' is equivalent to the r of equation (XIII) based
on a sample of size (N - 1). Therefore Table I may also be used for testing
the hypothesis that $\sigma = \sigma_0$ whatever be the population mean, by entering with
the number of degrees of freedom, N - 1.

In the example previously used, compute

$x = \dfrac{s^2}{\sigma_0^2} = 0.517$

From Figure 1, k is approximately .51, corresponding to P = .0422.

r' is not the same as the maximum likelihood ratio $\lambda'$ (6).

$\lambda' = \dfrac{p_{\max}(E \mid \sigma_0^2\,m)}{p_{\max}(E \mid \sigma^2\,m)} = N^{-N/2}\,v'^{N/2}\,e^{-\frac{1}{2}(v' - N)} = N^{-N/2}\,e^{N/2}\,v'^{1/2}\,r'$  . . . . (XXX)

As N becomes infinite the distribution of $\lambda'$ is the same as that of the $\lambda$ of (XVI).
For N = 49, the probabilities corresponding to $\lambda'$ agree with those using r' to
within a unit in the third decimal.

The $\lambda'$ test is biassed as may be seen in Figure 2 where we have plotted the
power of the test based on the region w defined by $v'_1 = 3.187$, $v'_2 = 22.912$ for
which $\alpha = .0436 + .0064 = .0500$, on the assumption that $\sigma_0^2 = 1.0$, for N = 10.
Although the criterion is biassed it is slightly more sensitive to alternatives
specifying $\sigma^2 < \sigma_0^2$ than is the unbiassed Critical Region of Type B defined by
$v'_1 = 2.953$, $v'_2 = 20.305$, $\alpha = .0339 + .0161 = .0500$. The criterion of
constant distribution, p(v'),

$v'^{(N-3)/2}e^{-v'/2} = c$,  . . . . (XXXI)

has also been considered. In this case $v'_1 = 1.903$, $v'_2 = 17.391$, $\alpha = .0071 +
.0429 = .0500$. This criterion is biassed for some alternatives specifying
$\sigma^2 < \sigma_0^2$, but its power curve lies above that of the unbiassed region for $\sigma^2 > \sigma_0^2$.

Fig. 2. Comparison of Critical Regions for v'. $H_0$ Specifies $\sigma_0^2 = 1.0$. N = 10.

Apparently the bias may be shifted at will by changing the exponent of v'.
This may be desirable if greater weight is to be given to one class of alternatives.
In fact decreasing the exponent of v' to 0 produces the Best Critical Region
for the class of alternatives specifying $\sigma > \sigma_0$, and defined by $v'_1 = 0$, $v'_2 = 16.919$
for $\alpha = .0500$. No region can be found giving greater power. On the other
hand this region is insensitive to alternatives of the other class. Increasing the
exponent indefinitely produces the Best Critical Region for the other class
defined by $v'_2 = \infty$ and $v'_1 = 3.325$ for $\alpha = .0500$.

ON THE POLYNOMIALS RELATED TO THE DIFFERENTIAL EQUATION

$\dfrac{1}{y}\dfrac{dy}{dx} = \dfrac{a_0 + a_1x}{b_0 + b_1x + b_2x^2} \equiv \dfrac{N}{D}$

By Frank S. Beale

Introduction. In a previous issue of this Journal,¹ E. H. Hildebrandt has
established the existence of a general system of polynomials $P_n(k, x)$ associated
with the solutions of Pearson's Differential Equation

(R)  $\dfrac{1}{y}\dfrac{dy}{dx} = \dfrac{N}{D}$,

N and D being polynomials in x of degrees not exceeding one and two respectively
with no factor in common.

It was shown that the polynomials $P_n(k, x) \equiv P_n$ themselves satisfy certain
differential equations and a recurrence relation. The classical polynomials of
Hermite, Legendre, Laguerre, and Jacobi are special types of $P_n(k, x)$. Since
the classical polynomials are employed rather extensively in statistical theory,
certain of their properties are of special interest.

It is the purpose of this paper to determine from Hildebrandt's general
equations some new properties of $P_n(k, x)$ and to apply these properties to the
classical polynomials. The paper consists of two parts. In part I some
theorems are established concerning common zeros of D and $P_n$. In particular,
a theorem is established to exhibit the conditions under which the zeros of $P_n$,
which are not zeros of D, are simple. In part II a method is outlined for the
classical polynomials by which one can determine the number and location of
the real zeros in the various segments into which the zeros of D divide the x axis.
The points of inflexion and the degree of the polynomials are also considered.

A new feature of the method employed is, we believe, its being based upon the
use of differential equations of first order, for the most part, while other
investigators² have employed differential equations of second order. As to the results
obtained, the author believes them to be partly new. They have points in
common with the results of Fujiwara, Lawton and Webster.


1 Systems of Polynomials Connected with the Charlier Expansions, etc., Annals of Math. 
Stat., Vol. II, 1931, pp. 379-439. 

2 M. Fujiwara: On the zeros of Jacobi's Polynomials, Japanese Journal of Math., Vol. 2, 
1926, pp. 1, 2. 

W. Lawton: On the zeros of Certain Polynomials Related to Jacobi and Laguerre Polynomials, 
Bull. Am. Math. Soc., Vol. 38, 1932, pp. 442-449. 

M. S. Webster: Thesis, Univ. of Penna. These results were kindly communicated to 
me by Dr. Webster. 

I. Theorems Concerning Common Zeros of $P_n(k, x)$ and $D$ 

The following equations will be employed later: 

\[ P_{n+1}(k, x) \equiv [N + (k - n)D']\,P_n(k, x) + D\,P_n'(k, x). \tag{1} \]

\[ P_{n+1}'(k, x) \equiv (n + 1)\left[N' + \frac{2k - n}{2}D''\right]P_n(k, x). \tag{2} \]

\[ P_{n+1}(k, x) \equiv [N + (k - n)D']\,P_n(k, x) + n\left[N' + \frac{2k - n + 1}{2}D''\right]D\,P_{n-1}(k, x). \tag{3} \]

These are not explicitly given in Hildebrandt's Paper but the method of obtaining 
them is outlined there in detail. 
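
The recurrence (1) lends itself to direct computation. The following is a minimal sketch, not part of the original paper, that builds $P_n(k, x)$ from $P_0 = 1$ with exact rational arithmetic and checks relation (2) on one sample. The helper names (`trim`, `add`, `mul`, `diff`, `p_sequence`, `satisfies_relation_2`) and the sample $N = 3 - 2x$, $D = 1 + x^2$, $k = 2$ are assumptions chosen only for illustration.

```python
from fractions import Fraction

# Polynomials are coefficient lists in ascending powers: [c0, c1, c2] = c0 + c1*x + c2*x^2.

def trim(p):
    """Drop trailing zero coefficients, keeping at least one entry."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def diff(p):
    return trim([i * c for i, c in enumerate(p)][1:] or [Fraction(0)])

def p_sequence(N, D, k, n_max):
    """Return [P_0, ..., P_{n_max}] from P_0 = 1 and recurrence (1)."""
    N = [Fraction(c) for c in N]
    D = [Fraction(c) for c in D]
    dD = diff(D)
    P = [[Fraction(1)]]
    for n in range(n_max):
        bracket = add(N, mul([Fraction(k - n)], dD))      # N + (k - n) D'
        P.append(add(mul(bracket, P[n]), mul(D, diff(P[n]))))
    return P

def satisfies_relation_2(N, D, k, n_max):
    """Check P'_{n+1} = (n+1) [N' + (2k-n)/2 * D''] P_n for n < n_max."""
    n1 = Fraction(N[1]) if len(N) > 1 else Fraction(0)     # N' (a constant)
    d2 = 2 * Fraction(D[2]) if len(D) > 2 else Fraction(0) # D'' (a constant)
    P = p_sequence(N, D, k, n_max)
    return all(
        diff(P[n + 1]) == mul([(n + 1) * (n1 + Fraction(2 * k - n, 2) * d2)], P[n])
        for n in range(n_max)
    )

# Hypothetical sample: N = 3 - 2x, D = 1 + x^2 (no common factor), k = 2.
P = p_sequence([3, -2], [1, 0, 1], k=2, n_max=3)
print(P[1], P[2], P[3])
print(satisfies_relation_2([3, -2], [1, 0, 1], k=2, n_max=3))
```

Exact fractions are used so that identities such as (2) can be tested by equality of coefficient lists rather than by a floating-point tolerance.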

We shall make use of the following lemma which we state without proof. 
Lemma (1). Let $P_n(x)$ be a polynomial of degree $n$. If both $P_n$ and $P_n'$ contain a 
factor $(x - a)^m$, $m < n$, then $P_n$ contains the factor $(x - a)^{m+1}$. 

We also need an expression for $P_{n+1}^{(q)}(k, x)$. By repeatedly differentiating (2) 
and eliminating the derivatives $P_i'(k, x)$ we get, 

\[ P_{n+1}^{(q)}(k, x) \equiv \prod_{i=0}^{q-1} (n + 1 - i)\left[N' + \frac{2k - n + i}{2}D''\right] P_{n+1-q}(k, x), \qquad q = 1, 2, \dots, (n + 1). \tag{4} \]


Theorem $I_1$. If $D$ is a perfect square, $D'$ is not a factor of $P_{n+1}(k, x)$, $n = 0, 1, 2, \dots$ 

Proof: Assume $D'$ to be a factor of $P_{n+1}$. From (1), $D'$ is either a factor of 
$P_n$ or of $N + (k - n)D'$. But $D'$ is not a factor of $N + (k - n)D'$ as this 
implies that $D'$ is a factor of $N$, contrary to the hypothesis on (R) that $D$ and $N$ 
have no factor in common. Thus, $D'$ is a factor of $P_n$, and by a repetition of the 
reasoning a factor finally of $P_1$, which, as was just pointed out, is impossible. 

Theorem $I_2$. Set $D = (\alpha_1 x + \beta_1)(\alpha_2 x + \beta_2)$, $D$ not a perfect square. If 
$\alpha_i x + \beta_i$, $i = 1$ or 2, is a factor of $P_n$, then $(\alpha_i x + \beta_i)^q$ is a factor of $P_{n+q-1}$, 
$q = 1, 2, 3, \dots$ 

Proof: From (1), $\alpha_i x + \beta_i$ being a factor of $P_n$ and $D$, is also a factor of 
$P_{n+1}$. From (2), $\alpha_i x + \beta_i$ is a factor of $P_{n+1}'$. From Lemma (1) it follows 
that $(\alpha_i x + \beta_i)^2$ is a factor of $P_{n+1}$. Continued repetition of the reasoning 
establishes the theorem. 

Corollary. If both $\alpha_1 x + \beta_1$ and $\alpha_2 x + \beta_2$ are factors of $P_n$, then $D^q$ is a factor 
of $P_{n+q-1}$. 

Theorem $I_3$. Assume $D$ of the same form as in Theorem $I_2$. If $\alpha_i x + \beta_i$, 
$i = 1$ or 2, is a factor of $P_{n+1}$ and no higher power of $\alpha_i x + \beta_i$ is such a factor then 
$\alpha_i x + \beta_i$ is a factor of $N + (k - n)D'$. 

Proof: From (1), $\alpha_i x + \beta_i$ being a factor of $P_{n+1}$ and of $D$ is also a factor of 
either $N + (k - n)D'$ or of $P_n$. But $\alpha_i x + \beta_i$ a factor of $P_n$ requires, from $I_2$, 
that $(\alpha_i x + \beta_i)^2$ be a factor of $P_{n+1}$, contrary to hypothesis. Thus, $\alpha_i x + \beta_i$ is a 
factor of $N + (k - n)D'$. 

Corollary. If $(\alpha_1 x + \beta_1)(\alpha_2 x + \beta_2)$, $(\alpha_1, \alpha_2 \ne 0)$, is a factor of $P_{n+1}$ and no 
higher power of either $\alpha_1 x + \beta_1$ or $\alpha_2 x + \beta_2$ is contained in $P_{n+1}$ then 
$N + (k - n)D' \equiv 0$. For from $I_3$, $N + (k - n)D'$ contains $(\alpha_1 x + \beta_1)(\alpha_2 x + \beta_2)$ as a factor 
which implies $N + (k - n)D'$, being linear, vanishes identically. 

Theorem $I_4$. If $(\alpha_i x + \beta_i)^q$ and no higher power of $\alpha_i x + \beta_i$ is a factor of 
$P_{n+q-1}$ then $\alpha_i x + \beta_i$ and no higher power of $\alpha_i x + \beta_i$ is a factor of $P_n$. 

Proof: Let us write, 

(A) $P_{n+q-1} = (\alpha_i x + \beta_i)^q \phi_{n-1}$, $\phi_{n-1}$ = a polynomial of degree $\le n - 1$ which 
does not contain the factor $\alpha_i x + \beta_i$. Taking the $(q - 1)$th derivative of (A) 
by Leibnitz' Theorem, we get, 

(B) \[ P_{n+q-1}^{(q-1)} = \sum_{j=0}^{q-1} \binom{q-1}{j} \frac{d^j}{dx^j}\left[(\alpha_i x + \beta_i)^q\right] \phi_{n-1}^{(q-1-j)}. \]

On replacing $q$ by $q - 1$ and $n$ by $n + q - 2$ in (4) there results, 

(C) \[ P_{n+q-1}^{(q-1)} = \prod_{j=0}^{q-2} (n + q - 1 - j)\left[N' + \frac{2k - n - q + 2 + j}{2}D''\right] P_n. \]

From (B) we see that $\alpha_i x + \beta_i$ is a factor of $P_{n+q-1}^{(q-1)}$. No higher power of 
$\alpha_i x + \beta_i$ is such a factor. From (C) our theorem now follows. 

Corollary (1). Under the hypotheses of Theorem $I_4$, $\alpha_i x + \beta_i$ is a factor of 
$N + (k - n + 1)D'$. This follows at once from $I_4$ and $I_3$. 

Corollary (2). If $D^q = (\alpha_1 x + \beta_1)^q (\alpha_2 x + \beta_2)^q$, $(\alpha_1, \alpha_2 \ne 0)$, is a factor of 
$P_{n+q-1}$ and no higher powers of either $\alpha_1 x + \beta_1$ or $\alpha_2 x + \beta_2$ are factors, then 
$N + (k - n + 1)D' \equiv 0$. For the linear expression $N + (k - n + 1)D'$ contains, 
from Corollary (1), the quadratic factor $(\alpha_1 x + \beta_1)(\alpha_2 x + \beta_2)$. 

The following lemma can be easily established and is given without proof. 
Lemma (2). Assume $D$ of the same form as in Theorem $I_2$. Then there is only 
one value of $s$ for which $N + sD'$ contains $\alpha_i x + \beta_i$ as a factor. 

Theorem $I_5$. Assume $D$ of the same form as in Theorem $I_2$. If $N + (k - n)D'$ 
contains $\alpha_i x + \beta_i$, $i = 1$ or 2, as a factor, then $P_{n+1}$ contains $\alpha_i x + \beta_i$ and no 
higher power of $\alpha_i x + \beta_i$ as a factor. 

Proof: From (1) we see that $P_{n+1}$ contains $\alpha_i x + \beta_i$ at least to the first power 
as a factor. Again from (1), if $P_{n+1}$ contains a higher power of $\alpha_i x + \beta_i$ as a 
factor, this means that both $P_n$ and $P_n'$ contain $\alpha_i x + \beta_i$ at least to the first 
power as a factor and from Lemma (1) it follows that $P_n$ contains $\alpha_i x + \beta_i$ at 
least to the second power as a factor. By Corollary (1) from Theorem $I_4$ it 
follows that $\alpha_i x + \beta_i$ is a factor of $N + (k - n_1)D'$ for $n_1 < n$, contrary to Lemma 
(2). 

Theorem $I_6$. If $\alpha_1 x + \beta_1$ and $\alpha_2 x + \beta_2$ are factors of $N + (k - n_1)D'$ and 
$N + (k - n_2)D'$ respectively, $(\alpha_1, \alpha_2 \ne 0)$, then $P_\mu \equiv 0$, $\mu > n_1 + n_2$. 

Proof: From Theorems $I_5$ and $I_2$ we see that $(\alpha_1 x + \beta_1)^{n_2} (\alpha_2 x + \beta_2)^{n_1}$, of 
degree $n_1 + n_2$, is a factor of $P_{n_1+n_2}$, of degree $n_1 + n_2$ at most. Similarly, 
$(\alpha_1 x + \beta_1)^{n_2+1} (\alpha_2 x + \beta_2)^{n_1+1}$, of degree $n_1 + n_2 + 2$, is a factor of $P_{n_1+n_2+1}$, of 
degree $n_1 + n_2 + 1$ at most. This implies $P_{n_1+n_2+1} \equiv 0$. Hence, $P_\mu \equiv 0$, 
$\mu > n_1 + n_2$. In fact, (1) shows that $P_\mu \equiv 0$ implies $P_\nu \equiv 0$, $\nu > \mu$. 

Theorem $I_7$. Assume $D$ of the same form as in Theorem $I_2$. Then $P_{n+1} \equiv 0$, 
$P_n \not\equiv 0$, implies either $N + (k - m)D' \equiv 0$, $m \le n$, or there exist two values of 
$m$, $(m_1, m_2)$, such that $N + (k - m_1)D'$, $N + (k - m_2)D'$ contain as factors 
$\alpha_1 x + \beta_1$ and $\alpha_2 x + \beta_2$ respectively, $(m_1, m_2 < n)$. 

Proof: Setting $P_{n+1} \equiv 0$ in (1) gives, 

\[ [N + (k - n)D']\,P_n + D\,P_n' \equiv 0. \tag{1°} \]

If $P_n \equiv$ const., (1°) shows that $N + (k - n)D' \equiv 0$ and our theorem is verified. 
Suppose $P_n \not\equiv$ const. We get from (1°), 

\[ P_n' \equiv -\frac{[N + (k - n)D']\,P_n}{D}. \]

Thus, $D$ is a factor of the numerator, and our theorem now follows from Corollaries 
(1) and (2) of Theorem $I_4$. 

Theorem h • If N + (k — m)D' ^ 0, m = 1,2, • • • n, and if N + (k — m)D’ 
contains neither a,x + 0i, nor atx + 0* as factors, then P n+ 1 and D have no factors 
in common. This follows at once from Theorems h and h which constitute a 
necessary and sufficient condition that P„ and D have factors in common. 

Theorem 7#. If N ss const, and if D is linear, aU P n are constants, n = 1, 2, 3, 
• • • . This follows directly from (2). 

Theorem Iw . If N' + D" ^ 0, m = 1, 2, • • • (n — 1), all zeros of P„ 

JL 

which are not zeros of D are simple. 

Proof: Suppose P n has a multiple zero x = a which is not a zero of D. Then 
(1) shows that a: is a zero of P n4 i . From (2), a is a zero of P„+ 1. From 
Lemma (1), a is at least a double zero of P n +i. Furthermore, (3) shows that a 
being a double zero of P n and of P n +i is also a double zero of P n -i. By a con¬ 
tinued application of (3), it follows that a is a double zero of Pi which is impos¬ 
sible since Pi is of degree < 1. 


II. Concerning the Zeros of $P_n(k, x)$ 

The polynomials $P_n(k, x)$ are defined by Hildebrandt$^3$ as follows: 

\[ P_n(k, x) = \frac{1}{y}\,D^{n-k}\,\frac{d^n}{dx^n}\left(D^k y\right), \]

where $y$ is a non-identically vanishing solution of the differential equation 

\[ \frac{1}{y}\frac{dy}{dx} = \frac{a_0 + a_1 x}{b_0 + b_1 x + b_2 x^2} = \frac{N}{D}. \]

3 L.c. pp. 400-401. 

The Jacobi Polynomials are defined as follows: 

\[ J_n(x, \alpha, \beta) = x^{1-\alpha}(1 - x)^{1-\beta}\,\frac{d^n}{dx^n}\left[x^{n+\alpha-1}(1 - x)^{n+\beta-1}\right], \qquad \alpha, \beta \text{ real}. \]

It follows that $J_n(x, \alpha, \beta)$ is a special type of $P_n(k, x)$ with $N \equiv (-\beta - \alpha)x + \alpha$, 
$D \equiv x(1 - x)$, $n = k + 1$, whence, 

\[ N' = -\beta - \alpha, \quad D' = 1 - 2x, \quad D'' = -2; \quad D(0) = D(1) = 0, \]

\[ P_1(k, x) \equiv N + kD' = 0 \quad \text{for} \quad x = \frac{\alpha + k}{\alpha + \beta + 2k}, \]

\[ P_1'(k, x) = -\beta - \alpha - 2k. \]
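
As a small worked check of these formulas (an illustration added here, not taken from the original), take $k = 1$, so that $n = 2$. Applying recurrence (1) once more to $P_1(1, x)$, with $N$ and $D$ as given above for the Jacobi case, yields an explicit quadratic:

```latex
\begin{aligned}
P_1(1, x) &= N + D' = (\alpha + 1) - (\alpha + \beta + 2)\,x, \\
J_2(x, \alpha, \beta) \equiv P_2(1, x) &= N\,P_1(1, x) + D\,P_1'(1, x) \\
 &= \alpha(\alpha + 1) - 2(\alpha + 1)(\alpha + \beta + 1)\,x + (\alpha + \beta + 1)(\alpha + \beta + 2)\,x^2 .
\end{aligned}
```

The coefficient of $x^2$ vanishes exactly when $\alpha + \beta = -1$ or $-2$, which gives a first glimpse of why the degree of $J_n$ can fall below $n$ for integral $\alpha + \beta$.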


In determining the number and location of the real zeros of the Jacobi Polynomials 
we employ the following notations: 

$P_i(k, x) = 0$ for $x = \alpha_{i,k,j}$, $i = 1, 2, \dots, k + 1$; $k = 0, 1, 2, \dots$; $j = 1, 2, \dots, i$. 

\[ \alpha_{i,k,j} < \alpha_{i,k,j+1}, \]

\[ \theta = N' + \frac{2k - n}{2}D'' = -\beta - \alpha - 2k + n, \qquad n = 1, 2, \dots, k, \]

\[ \mu = [N + (k - n)D']_{x=0} = \alpha + (k - n), \]

\[ \nu = [N + (k - n)D']_{x=1} = -\beta - (k - n). \]

We proceed to determine the number of real zeros of the Jacobi Polynomials 
on the intervals $(-\infty, 0)$, $(0, 1)$, $(1, \infty)$ into which the zeros of $D$ divide the 
$x$ axis.$^4$ The proofs proceed by mathematical induction. We first determine 
the location of the real zeros of $P_n(k, x)$, $n = 1, 2, \dots, k + 1$, by successive 
applications of (1) and (2). We then use the relation $P_{k+1}(k, x) \equiv J_{k+1}(x, \alpha, \beta)$. 

Several cases concerning possible values of $\alpha$ and $\beta$ should be considered. In 
order to bring out the method of procedure only two such cases will be fully 
discussed here. The results for other possible cases will be merely listed. 

$A_1$: $\alpha < 0$, $\beta < 0$, $|\alpha| < |\beta|$; $\alpha$, $\beta$, $\alpha + \beta$ not integers. 

Let $k_1$ be the greatest integer contained in $|\alpha|$, 
let $k_2$ be the greatest integer contained in $|\beta|$, 
let $k_3$ be the greatest integral value of $k$ for which $\alpha + \beta + 2k < 0$. Then 

\[ 0 \le k_1 \le k_3 \le k_2. \]


4 In the case $\alpha, \beta > 0$ these zeros all lie, as is known, on (0, 1). 

$A_{11}$: $0 \le k \le k_1$. We then have $\theta > 0$, $\mu < 0$, $\nu > 0$, $0 < \alpha_{1,k,1} < 1$, $P_1' > 0$. 
Then $J_{k+1}(x, \alpha, \beta)$ has $\frac{(1)^k + (-1)^k}{2}$ zeros in (0, 1). These are the only real zeros. 

Proof: Consider first $P_1(k, x)$. Its only zero is at $\alpha_{1,k,1}$, where $0 < \alpha_{1,k,1} < 1$. 
Furthermore, $P_1' > 0$. Also $P_1 > 0$ for $x > \alpha_{1,k,1}$ and $< 0$ for $x < \alpha_{1,k,1}$. From 
(1) we see that $P_2(k, \alpha_{1,k,1}) > 0$, (since $P_1(k, \alpha_{1,k,1}) = 0$, $D(\alpha_{1,k,1}) > 0$ and $P_1' > 0$). 
From (2) it follows that $P_2'(k, x) < 0$ for $x < \alpha_{1,k,1}$, $P_2'(k, \alpha_{1,k,1}) = 0$, $P_2'(k, x) > 0$ 
for $x > \alpha_{1,k,1}$. These conclusions follow from remarks concerning the sign of $\theta$, 
the fact that $P_1(k, \alpha_{1,k,1}) = 0$, and from remarks concerning the sign of $P_1$ to the 
left and to the right of $x = \alpha_{1,k,1}$. Thus, $P_2(k, x) > 0$ for all real $x$ and hence 
has no real zeros. By employing (2), it is now evident that $P_3'(k, x) > 0$. From 
(1) and remarks concerning $\mu$ and $\nu$ we see that $P_3(k, 0) < 0$ and $P_3(k, 1) > 0$. 
Thus $P_3(k, x)$ has a single real zero $\alpha_{3,k,1}$, $0 < \alpha_{3,k,1} < 1$. The reasoning from 
$P_3$ to $P_4$ is analogous to that from $P_1$ to $P_2$. By continuing this procedure we 
finally conclude that $P_{k+1}(k, x)$, $(\equiv J_{k+1}(x, \alpha, \beta))$, has but one real zero, (in (0, 1)), 
if $k$ is even and no real zeros if $k$ is odd. 


$A_{12}$: $k_1 < k \le k_3$. Set $k = k_1 + q$, $q = 1, 2, \dots, k_3 - k_1$. Here $\theta > 0$; 
$\mu > 0$, $n = 1, 2, \dots, q - 1$; $\mu < 0$, $n = q, q + 1, \dots, q + k_1$; $\nu > 0$, $\alpha_{1,k,1} < 0$, 
$P_1'(k, x) > 0$. $J_{k_1+q+1}(x, \alpha, \beta)$ has $q$ distinct zeros in $(-\infty, 0)$ and 
$\frac{(1)^{k_1} + (-1)^{k_1}}{2}$ zeros in (0, 1). These are the only real zeros. 


Proof: First consider the sequence P n (k, x) n = 1 , 2 , • • • q, since the conditions 
on 0, ix, and v do not change over this range of n. Now P\{k, oi,*,i) = 0 , ai,*,i < 
0 . Furthermore since Pi > 0 we have Pi > 0 for x > ai,*,i and < 0 for x < 
ai,*,!. Pass now to Pt(k, x). Since D(ai,*,i) < 0 and P[ ( k , ai,*,i) > 0 , we see 
from (1) that P 2 (fc, ai,t,i) < 0. Moreover (2) shows P[ (k, <*i,*,i) = 0, P’ t (k, x) 
< 0 for x < ai.t.i and > 0 for x > <*i,*,i. Thus P 2 (fc, x) < 0 and a relative 
minimum at x = <*i,*,i. Since j P*(fc, ± ») | = «, we see that P*(k, x) has two 
real zeros of which the left most, a 2 ,i,i, is in (— », 0 ). Again y > 0 together 
with ( 1 ) assures Pt(k, 0) > 0 . Thus a 2 ,t,* is in (ai.i.i, 0 ), hence in (— «, 0). 
By continuing this reasoning on the successive P*(fc, x), ra = 1 , 2 , • • • q, we 
conclude that P,(fc, x) has q zeros in — °°, 0 and P' Q (k, <*«,i,0 < 0 . 

Next, consider the sequence P n (k, x), n — q + 1 , q + 2 , • • • q + fci + 1. 
Over this range of n we have 6 > 0 , y < 0 , v > 0. From what has just been 
shown, P 4 (fc, £*„,*,<) = 0 , — eo < < 0 , i = 1, 2 , • • • q. Also P 4 (k, a q ,k.i), 

i — 1, 2, • • • q, is alternately negative and positive. Suppose q odd, (similar 
reasoning holds for q even). Thus, we suppose P' q (k, <* 9 .i,i) < 0, P' Q (k, a q ,k, q ) < 
0 , P t (k, x) > 0 for x < a q ,k, i and < 0 for x > . ( 1 ) shows P q +i(k, a,.*.,), 

i — 1, 2, • • • g, to be alternately positive and negative. Thus, the zeros a 9 ,*,i 
are separated by g — 1 zeros of P q +i(k, x). Since from (1), P q +i(k, a„,k. i) > 0 
and from (2) P q +i(jk, x) > 0 for x < <*,,*,i, there exists a zero a 4 + i,*.i in (— «, 
a q ,k,i). Thus far, we have established the existence of g zeros of P,+i(fc, x) in 
$(-\infty, 0)$. $q$ being odd, we have from (1), $P_{q+1}(k, \alpha_{q,k,q}) > 0$. Also from (2), 
$P_{q+1}'(k, x) < 0$ for $x > \alpha_{q,k,q}$. Again from (1) and assumptions regarding $\mu$ and 
v it follows that P«+i(k, 0 ) > 0 , Pq+i(k, 1 ) < 0 . Thus, P q +i(k, x) has a zero 
(0,1). There being no extrema for Pq+i(k, x) other than the <*,.*.<, 
i = 1, 2, • • • q, (as (2) shows), we have thus proved that P&.i(k, x) has q 
distinct zeros in (— *>, 0) and a single zero in (0,1). Reasoning similarly from 
P q +\(k, x) to P q +i(k, x) we establish the existence of q distinct zeros a 4 + i,t,<, 
* = 1 , 2 , • • • q, in (- », 0 ) with a 4 +St k,i in (- », a 4 + i, t .i) and a 4 +J ,*,i, i « 
2, 3, • • • q, separating , t = 1, 2, • • • q. From ( 1 ) we see that Pq+t(k, 

etq+i,k,q) < 0 and P 4 + s(fc, a 4 +i,i, B +i) < 0. The only extrema of Pq+*(k, x), 
(as (2) shows), are located at , i — 1, 2, • • • q + 1. Again, by (2), 

Pq+*(k, x) < 0 for x > ttg+i.jt.j+i ; hence there can be no real zeros of P 4+J except 
the q zeros in (— w, 0) already found. The reasoning from P,+t to Pq+t is 
similar to that from P 4 to Pq+i . Thus, P 4+ *,+i = •/*,+„+! has q distinct zeros in 
(— oo, 0 ), together with one zero in ( 0 , 1 ) for k x even. For fcx odd, there are q 
distinct zeros in (— », 0 ) only. The results are the same whether q is odd or 
even. 

The results for the remaining sub-cases under case Ai are given in the table 
which follows. For completeness, the results for cases A u and A u are included 
in the tabulation. A few words of explanation are necessary to clarify the 
conditions under which the various sub-cases in the table occur. Let | a ] = 
&x + q, | 0 | = fa + h, h, q < 1. If q + h < 1, then | a + /3 | = fci + kq and we 
have either, 

A mi •* h + k» even, 2kz = ki + kt a* k» — ki = kt — k». 

Am : ki + kt odd, 2k» = ki + fa — 1 * k» — kx =s kt — k 3 — 1 . 

Again if 1 < q + h < 2, then |a + d| = fci + fcs + l and we have either, 

Am : k t + kt + 1 even, 2kt = ki + kt + 1 = fcj — fci = kt — fcj + 1 . 

Am : ki + kt + 1 odd, 2kt — ki + kt = k 3 — ki = kt — kt 
In cases Ai« and Am we assume \ a-\-p\ = k 1 + kt + p,p<l, while in cases 
Am and A«* ,|a + ^| = ii + fe + p, 1 < p < 2. The complete results for 
case Ax follow. (See page 213.) 

A* : a < 0,0 < 0, | a | < | P |, a, 0 not integers, a + 0 = integer. Define ki , 
kt, kt as in Ax. Then 0 < ki < kt < kt. In Case Aj t , <3 + a is odd while in 
Case A«, 0 + a is even. (See page 214.) 

A*:a<0, d< 0 , a = —ki, integer, fi not an integer, \ a | < | j3 |. Define 
ki, kt, k» as in Ax. Then 0 < ki < k 3 < kt. There are two sub-cases, An: the 
greatest integral value of a + 0 is odd, A» : this integral value is even. (See 
page 215.) 

Aq : a < 0, d < 0, a not an integer, 0 = —k x , integer, \ a \ < 1 0 1. Define 
h , kt , h as in Ai. Then 0 < k x < k t < kt. There are two sub-cases, A« : 
the integral part of a + 0 is odd, Am : this integral value is even. (See page 216). 
Jkx+ki+q+i; q * 2, 3, • • •; Same zeros as in Aui for corresponding values of q. 
$A_5$: $\alpha < 0$, $\beta < 0$, $|\alpha| < |\beta|$, $\alpha = -k_1$ integer, $\beta = -k_2$ integer. Define 


ki, fc,, ki as in Ai. 

Cases Polynomial 

In cases Aw and A« , a + 0 is odd and even respectively. 

Range of Sub-Script Zeros in 



(- *, 0 ) 

a? - 0 

( 0 , 1 ) 

(l)* + (- 1 )* 

2 1 

A 511 , A^i; 

Jk+lt 

0 < k < fci; 0 ; 

0 ; 

Ash, Atw; 


q - 0, 1 , 2 , •••,*» - fci; q; 

fci + 1 ; 

0 

Asia; 

J kt+q+11 

©< 

1 

-vT 

1 

«o 

I 

l 

e* 

C<f 

r—i 

H 

fci +1; 

0 

A 828$ 

A 514, A 524,' 

Aju, A 525 ; 

J * j'Hr+i i 

J * 1 + 0+1 38 0j 

Jki+ki+q+l 58 0j 

q * 1, 2, • •, fc, - fc, - 1; fc, - fci - g + 1; 

S' “ 0,1, 2, • • • fci. 

q ~ 1, 2, 3, • • • 

; fci + i; 

0 


If assumptions are identical with those of A 5 except | a | = | 0 | , then for 
0 < A; < ki, the results agree with Mn and J^+g+i s 0, q = 0, 1, 2, • . 

A« : a > 0, 0 < 0, | a | > | 0 | , 0 not an integer. Let fci be the largest integer 
in 0. 

Case Polynomial Range of Sub-Script 

(0, 1) 

A«i J jt+i 0 fC k <C k\ 0 

Asa Jkt+q+i q — 1, 2, 3, • • • q 


Zeros in 

(i, *>) 

(l)* + (-l)‘ 
2 

(l )* 1 + (-l)“ 
2 


A 7 : Same assumptions as in A« except 0 = — k t , integer. 


Case 

Polynomial 

Range of Sub-Script 

(0, 1) 

Zeros in 

X = 1 

A 71 

Jk+l 

0 < k < Iti — 1 

0 

0 ( - 

A 7 , 

J k X +q+]. 

q = 0,1, 2, • • • 

q 

ki + 1 


Ag:a>0, 0<O, |a| = |0|. «7j = a and results for J n , n > 1 are 
identical with those in A? and A. respectively according as 0 is or is not an integer. 
A» : a > 0, 0 < 0, | a | < | 0 | ; 0, a + 0, not integers. 

Let ki be the greatest integer in a + 0. 


it 

U 



u 


it 

u 


a 

tt 


“ fi. 

for which a + 0 + 2k < 0. 


u a 





Then 0 < kt < h < k %. 


Case 

Polynomial 

Banco of Sub-Script 


Zeros in 




(-*,0) 

(0, 1) 

(i> °°) 

An; 

*+i; 

0 £ k<kf, 

* + i; 

0; 

0 

Am; 


g -1,2, •••,*»; fci even; 

k» — q +1; 

0; 

0 

Am; 


q- 1,2, •••, (Jba + 1); *i odd; 

k$ — q + 2; 

0; 

i 

Am; 


q - 1,2, •••, (** - *i); 

0; 

0; 

(l)*i+<r + (_l)*i+< 
2 

Am; 


q - 1,2, 3, •••; 

0; 

r, 

(1)** + (-1)*’ 

2 


Am : Same assumptions as in At but now \ a | = | 0 |. Then ki — k t = 0, 
Jx = a, and results for J n ,n> 1 are the same as in Am and An . 

Au : Same assumptions as in At except (i = — kt, integer. 

Case Polynomial Range of Sub-Script Zeros in 

(— oo,0) (0,1) X = 1 (1, ®) 

Au.i Same as A« 

Au.i Same as A« 

An, 3 Same as Am 

Au.i Jkt+t+ii g = l,2,3, •••; 0; q; kt + 1; 0 

An : a > 0, j8 < 0, | a | < | 0 | , p not an integer, a + I9 — odd integer. 
Define k\, kt, k a as in A t . 

Au : Same assumptions as in A a except a + ft = even integer. 


Cases 

Polynomial 

Range of Sub-Script 

Zeros in 

Ai2,i, Axa.i; 

Same as A»i 


(-00, 0) 

Ai2,2; | 

[ju,+» - const. > 0; 

Q = 1,2, • •- , k»; 

kt — q + 1 

Am,2 ; | 

A12,8 t Ais.s; 

A12,4 f Ai8,4 ; 

| «/*,+8+lj 

[Jik,+) = const. > 0; 

Same as A» s 

Same as A M 

II 

JO 

V 

+ 

h - q + 2 

$A_{14}$: Same assumptions as in $A_{12}$, except $\beta = -k_2$, integer. Cases $A_{14,1}$, 
$A_{14,2}$ and $A_{14,3}$ have the same results as $A_{12,1}$, $A_{12,2}$, and $A_{12,3}$ respectively. 
$A_{14,4}$ has the same results as $A_{11,4}$. 

$A_{15}$: Same assumptions as $A_{13}$ except $\beta = -k_2$, integer. Cases $A_{15,1}$, $A_{15,2}$, 
and $A_{15,3}$ have the same results as $A_{13,1}$, $A_{13,2}$, and $A_{13,3}$ respectively. $A_{15,4}$ has 
the same results as $A_{11,4}$. 

An : a as 0, fi < 0, 0 — not an integer. 

Let ki be the largest integer contained in 0 . 

“ be the largest integer for which ft + 2k < Q. 


Cue 

Polynomial 

Range of Sub-Script 

Zeros in 





(-*, 0 ) z -0 ( 0 , 1 ) 

( 1 , 00 ) 

Aie.i; 

Jk+ 1 ; 

0 < k < k z \ 

k; 1 ; 0 ; 

0 

A16.2; 

J * 1 + 9 + 1 » 

q » 1 , 2 , ••• , A* - kz; t 

h-q; 1 ; 0 ; 

1-9 + 1; i; 0; 

0 ; k\ even 

1; ki odd 

Aie.s; 

J *1+9+1) 

Q “ 1 , 2 , 3 , 

0; 1; 9-I; 

a)** + (-1)*' 
2 

A 17 

: a = 0, p 

= — fci — odd integer. 

Define fc 3 as in Aw . 


Ais 

: a = 0, P 

= — fcj — even integer. 

Define k» as in Aw . 



Cases 

Ai7,i, Aim; 

Polynomial 

Same as A^.i 

Range of Sub-Script 

(- », 0 ) 

Zeros in 

*-0 ( 0 , 1 ) 

H 

II 

1 —* 

A17.2; 

J *a+9+l i 

9 - 1 , 2 , ••• ,*1 - 1 ; 

fcs- 9 ; 

i; 

o; 

0 

Al 8 , 2 » 

Ai 7 , 8 , Ai 8 ,s; 

1 **+9+1 J 
fJki+i m 0 

9 “ 1 > 2 , • • •, h% + 1 ; 

kt 9 + 1; 

i; 

0 ; 

0 

Aw : ot 

l/*i+«+i; 

= 0 , 0 = 0 . 

9 - 1,2,3, •••; 

J, m 0. 

0 ; 

i; 

9-1; 

k\ + 1 


$J_{k+1}$ has $k - 1$ zeros in (0, 1), 1 zero at $x = 0$, 1 zero at $x = 1$, $k = 1, 2, 3, \dots$. 

From the definition of $J_n(x, \alpha, \beta)$ it is readily seen that $J_n(x, \alpha, \beta) = (-1)^n 
J_n(1 - x, \beta, \alpha)$. Thus, a transformation of $x$ to $1 - x$ interchanges $\alpha$ and $\beta$. 
The interval $(-\infty, 0)$ is transformed into $(1, \infty)$ and vice-versa. The points 
$x = 0$ and $x = 1$ are interchanged. Consequently, in all previous results we 
may properly interchange $\alpha$ and $\beta$. 
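
The reflection property just stated can be checked mechanically. The sketch below is an illustration only, not part of the original paper. It assumes the Jacobi specialization $N = \alpha - (\alpha + \beta)x$, $D = x(1 - x)$, $k = n - 1$, builds $J_n$ by recurrence (1), and compares it with $(-1)^n J_n(1 - x, \beta, \alpha)$. All function names and the sample values of $\alpha$ and $\beta$ are hypothetical.

```python
from fractions import Fraction

# Polynomials are coefficient lists in ascending powers of x.

def trim(p):
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def diff(p):
    return trim([i * c for i, c in enumerate(p)][1:] or [Fraction(0)])

def jacobi(n, alpha, beta):
    """J_n(x, alpha, beta) = P_n(n - 1, x), built from P_0 = 1 by recurrence (1)."""
    N = [Fraction(alpha), -Fraction(alpha) - Fraction(beta)]
    D = [Fraction(0), Fraction(1), Fraction(-1)]          # x(1 - x)
    dD = diff(D)
    k = n - 1
    P = [Fraction(1)]
    for m in range(n):
        bracket = add(N, mul([Fraction(k - m)], dD))      # N + (k - m) D'
        P = add(mul(bracket, P), mul(D, diff(P)))
    return P

def reflect(p):
    """Coefficients of p(1 - x), by Horner's scheme."""
    out = [Fraction(0)]
    for c in reversed(p):
        out = add(mul(out, [Fraction(1), Fraction(-1)]), [c])
    return out

def symmetry_holds(n, alpha, beta):
    lhs = jacobi(n, alpha, beta)
    rhs = mul([Fraction((-1) ** n)], reflect(jacobi(n, beta, alpha)))
    return lhs == rhs

# Hypothetical sample parameters.
a, b = Fraction(1, 3), Fraction(-5, 2)
print([symmetry_holds(n, a, b) for n in range(1, 6)])
```

Because the comparison is exact, a single failing coefficient would show up as `False` rather than as a small numerical discrepancy.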

In the foregoing results, the only real multiple zeros that can occur are at 
either x = 0 or x = 1. In the process of determining the degree of multiplicity 
of such zeros use was made of Theorem h • 

Points of Inflexion. By taking (4), setting $k = n$, and replacing $N'$ and $D''$ 
by their values for Jacobi polynomials, we get: 

\[ P_{n+1}''(n, x) \equiv (n + 1)(n)[\beta + \alpha + n][\beta + \alpha + n + 1]\,P_{n-1}(n, x). \]

From definitions of $P_n(k, x)$ and $J_n(x, \alpha, \beta)$ we easily verify that, 

\[ P_n(n \pm q, x) \equiv J_n(x, \alpha \pm q + 1, \beta \pm q + 1), \]

whence, 

\[ J_{n+1}''(x, \alpha, \beta) = (n + 1)(n)[\beta + \alpha + n][\beta + \alpha + n + 1]\,J_{n-1}(x, \alpha + 2, \beta + 2). \]

We conclude that if neither $\alpha + \beta + n$ nor $\alpha + \beta + n + 1$ vanishes, the points 
of inflexion of $J_{n+1}(x, \alpha, \beta)$ are at the zeros of odd order of $J_{n-1}(x, \alpha + 2, \beta + 2)$. 

The Degree of $J_n(x, \alpha, \beta)$. In analyzing the results of cases $A_1$ to $A_{19}$ inclusive, 
it is noted that in some cases the number of real zeros of $J_n$ is less than $n$. The 
question naturally arises whether the degree of $J_n$ is $n$ or less, for then we can 
determine the number of its imaginary zeros. The explicit expression of 
$J_n(x, \alpha, \beta)$ is known, from which the degree of $J_n$ can be found for various $\alpha$ and 
$\beta$. However, the degree of $J_n$ can also be found from (4). 

Since $J_{n+1}(x, \alpha, \beta) \equiv P_{n+1}(n, x)$, let us replace $k$ by $n$ in (4) and at the same 
time replace $N'$ and $D''$ by their values for Jacobi Polynomials. Thus, we get: 

\[ J_{n+1}^{(q)}(x, \alpha, \beta) = \prod_{i=0}^{q-1} (n + 1 - i)[-\beta - \alpha - n - i]\,P_{n-q+1}(n, x), \qquad n = 0, 1, 2, \dots;\; q = 0, 1, \dots, (n + 1). \tag{5} \]


We may establish the following results. 

$C_1$) If $\alpha + \beta$ is not an integer, the degree of $J_{n+1}(x, \alpha, \beta)$ is $n + 1$, $n = 0, 1, 2, \dots$. 

In fact, in order for $J_{n+1}^{(q)}$ to vanish, we see from (5) that either some factor 
$-\beta - \alpha - n - i$ vanishes or $P_{n-q+1}(n, x)$ vanishes identically. We first show 
that the latter is not possible. Now $P_1(n, x) = N + nD' = (-\beta - \alpha - 2n)x + \alpha + n \not\equiv 0$ 
since $\beta + \alpha$ is not an integer. Consequently, if $P_\mu(n, x) \equiv 0$, 
$\mu > 0$, $\mu < n + 1$, there will be a first value of $\mu$, $(\mu = \nu)$, for which $P_\nu(n, x) \equiv 0$ 
but $P_{\nu-1}(n, x) \not\equiv 0$. By virtue of Theorem $I_7$ this means that either 
$N + (n - p)D' \equiv [-\beta - \alpha - 2(n - p)]x + \alpha + n - p \equiv 0$, $p < \nu$, or else there 
exist two values of $p$, $(p_1, p_2)$, such that $[-\beta - \alpha - 2(n - p_1)]x + \alpha + n - p_1$ 
and $[-\beta - \alpha - 2(n - p_2)]x + \alpha + n - p_2$ are divisible by $x$ and $1 - x$ 
respectively, $p_1, p_2 < \nu - 1$, $p_1 \ne p_2$. Since, however, $\alpha + \beta$ is not an integer 
we see that $[-\beta - \alpha - 2(n - p)]x + \alpha + n - p \not\equiv 0$, $n$ and $p$ being integers. 
This eliminates the first possibility that $P_\mu(n, x) \equiv 0$, $\mu < n + 1$. Again, if 
$[-\beta - \alpha - 2(n - p_1)]x + \alpha + n - p_1$ is divisible by $x$, we have $\alpha + n - p_1 = 0$, 
or $\alpha$ an integer. For 

\[ (\alpha + n - p_2) - [\beta + \alpha + 2(n - p_2)]x \equiv (\alpha + n - p_2)\left[1 - \frac{\beta + \alpha + 2(n - p_2)}{\alpha + n - p_2}\,x\right] \]

to be divisible by $1 - x$ requires $\beta + n - p_2 = 0$, or $\beta$ an integer. $\alpha$ and $\beta$ are 
therefore both integers, contrary to hypothesis. Thus, in (5), no polynomial 
$P_{n-q+1}(n, x) \equiv 0$ and $J_{n+1}^{(q)}(x, \alpha, \beta) \not\equiv 0$. 
Replacing $q$ by $n + 1$ in (5) leads to, 

\[ J_{n+1}^{(n+1)}(x, \alpha, \beta) = \prod_{i=0}^{n} (n + 1 - i)[-\beta - \alpha - n - i]\,P_0(n, x), \qquad n = 0, 1, 2, \dots. \tag{6} \]

Thus $J_{n+1}^{(n+1)} \not\equiv 0$, (since $P_0(n, x) = 1$ and no factor $-\beta - \alpha - n - i$ can vanish) 
and the degree of $J_{n+1}$ is precisely $n + 1$. From similar reasoning we prove: 

$C_2$) If $\alpha + \beta > 0$ the degree of $J_{n+1}$ is $n + 1$, $n = 0, 1, 2, \dots$. 

$C_3$) If $\alpha + \beta = 0$, then (I) $J_1 = \alpha$ and (II) $J_{n+1}$ is of degree $n + 1$, $n = 1, 2, 3, \dots$. 

$C_4$) If $\alpha + \beta = -M$ = integer, $M > 0$, $\beta$, $\alpha$ not integers, then, 

(I) For $n < M$, the degree of $J_{n+1}$ is min. $(n + 1, M - n)$. 

(II) $n = M$, $J_{n+1}$ = const. 

(III) $n > M$, the degree of $J_{n+1}$ is $n + 1$. 

$C_5$) If $\alpha + \beta = -M$ = integer, $M > 0$, $\alpha$, $\beta$ integers, $\alpha > 0$, $\beta < 0$, then, 

(I) For $n < M$, the degree of $J_{n+1}$ is min. $(n + 1, M - n)$. 

(II) $n = M$, $J_{n+1} \equiv$ const. 

(III) $n > M$, the degree of $J_{n+1}$ is $n + 1$. 
CO If a + 0 = — M — integer, M > 0, a - — ^-integer, 0 = — fcj-integer, 
ki < ki then, 

(I) For n < h , Jnn is of degree n + 1. 

(II) 71 ^ k% , J n+i S 0. 

C7) If a + 0 = — M — integer, M > 0, a = 0 = — Jfci-integer, then, 

(I) For n < ki, J„+i is of degree n + 1, 

(II) n > ki , J n+1 = 0. 
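
A degree statement of this kind can be spot-checked numerically. The sketch below is an illustration only. It assumes the Jacobi specialization $N = \alpha - (\alpha + \beta)x$, $D = x(1 - x)$, $k = n - 1$ and the reading of result $C_4$ as "degree min$(n + 1, M - n)$ for $n < M$, a constant for $n = M$, degree $n + 1$ for $n > M$". The function names and the sample $\alpha = 1/2$, $\beta = -7/2$ (so $M = 3$) are hypothetical.

```python
from fractions import Fraction

# Polynomials are coefficient lists in ascending powers of x.

def trim(p):
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def diff(p):
    return trim([i * c for i, c in enumerate(p)][1:] or [Fraction(0)])

def jacobi(n, alpha, beta):
    """J_n(x, alpha, beta) = P_n(n - 1, x), built from P_0 = 1 by recurrence (1)."""
    N = [Fraction(alpha), -Fraction(alpha) - Fraction(beta)]
    D = [Fraction(0), Fraction(1), Fraction(-1)]          # x(1 - x)
    dD = diff(D)
    P = [Fraction(1)]
    for m in range(n):
        bracket = add(N, mul([Fraction(n - 1 - m)], dD))  # N + (k - m) D', k = n - 1
        P = add(mul(bracket, P), mul(D, diff(P)))
    return P

# Hypothetical sample for case C_4: alpha, beta not integers, alpha + beta = -M.
alpha, beta, M = Fraction(1, 2), Fraction(-7, 2), 3

# Observed degree of J_{n+1} for n = 0, ..., 5 (a nonzero constant has degree 0).
observed = [len(jacobi(n + 1, alpha, beta)) - 1 for n in range(6)]

# Degrees predicted by C_4 as read above.
expected = [min(n + 1, M - n) if n < M else (0 if n == M else n + 1)
            for n in range(6)]

print(observed, expected)
```

For this sample the polynomial $J_3$ collapses to degree one and $J_4$ to a constant, while $J_5$ and $J_6$ regain their full degree.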

The Laguerre Polynomials. These are defined as follows: 

\[ L_n \equiv L_n(x, a) = x^{1-a} e^{x}\,\frac{d^n}{dx^n}\left[e^{-x} x^{n+a-1}\right], \qquad n = 0, 1, 2, \dots; \]

$a$ real. We see that $L_n$ is a special case of $P_n(k, x)$ with $N \equiv -x + a$, 
$D \equiv x$, $n = k + 1$. It follows that $\theta = -1$, $\mu = a + k - n$, $\alpha_{1,k,1} = a + k$, 
and $P_1'(k, x) = -1$. These can be used in determining the location of the real 
zeros of $L_n$, as was done for $J_n$. The discussion here is somewhat simplified 
since $L_n$ has but one parameter, $a$, and the $x$-axis is divided by the zeros of $D(x)$ 
into two segments only, namely, $(-\infty, 0)$ and $(0, \infty)$. 

The following results are easily obtained. 

$B_1$: $a > 0$. $L_n(x, a)$ has $n$ distinct zeros in $(0, \infty)$, $n = 1, 2, 3, \dots$. This 
result is well known. 

$B_2$: $a = 0$. $L_{n+1}(x, a)$ has $n$ distinct zeros in $(0, \infty)$ and a simple zero at $x = 0$, 
$n = 0, 1, 2, \dots$. 

$B_3$: $a < 0$, $a$ not an integer. Let $k_1$ be the largest integer contained in $|a|$. 

(I) $L_{k+1}(x, a)$ has $\frac{(1)^k + (-1)^k}{2}$ zeros in $(-\infty, 0)$, $0 \le k \le k_1$, 

(II) $L_{k_1+q+1}(x, a)$ has $q$ distinct zeros in $(0, \infty)$ and $\frac{(1)^{k_1} + (-1)^{k_1}}{2}$ zeros in 
$(-\infty, 0)$, $q = 0, 1, 2, \dots$. 

$B_4$: $a < 0$, $a = -k_1$ = integer. 

(I) $L_{k+1}(x, a)$ has $\frac{(1)^k + (-1)^k}{2}$ zeros in $(-\infty, 0)$, $0 \le k < k_1$. 

(II) $L_{k_1+q+1}(x, a)$ has $q$ distinct zeros in $(0, \infty)$ and a zero of order $k_1 + 1$ at 
$x = 0$, $q = 0, 1, 2, \dots$. 

The Degree of $L_n(x, a)$. We show first that here $P_m(n, x) \not\equiv 0$, $m = 1, 2, \dots, n + 1$. 
By definition, $P_1(n, x) \equiv N + nD' \equiv -x + a + n \not\equiv 0$. Let us 
rewrite (2) for our present situation thus: 

(2°) $P_m'(n, x) = -m\,P_{m-1}(n, x)$. If, now, $P_m(n, x) \equiv 0$, then from (2°) it follows 
that $P_{m-1}(n, x) \equiv 0$. Continuing this reasoning, we finally arrive at a contradiction, 
namely, $P_1(n, x) \equiv 0$. If in (4) we set $q = n + 1$ and replace $N'$ and $D''$ 
by their values we get: 

\[ L_{n+1}^{(n+1)}(x, a) = (-1)^{n+1}(n + 1)!\,P_0(n, x) = (-1)^{n+1}(n + 1)! \]

Hence, $L_{n+1}$ is of degree $n + 1$. Note that this holds regardless of the value of 
$a$, contrary to what was found for Jacobi Polynomials. 
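
The degree result for $L_{n+1}$ amounts to saying that the leading coefficient of $L_n$ is $(-1)^n$ whatever the value of $a$. A minimal sketch of a check follows. It is not the author's computation. It assumes the specialization $N = -x + a$, $D = x$, $k = n - 1$ of recurrence (1), and the function name `laguerre` and the sample values of $a$ are hypothetical.

```python
from fractions import Fraction

# Polynomials are coefficient lists in ascending powers of x.

def trim(p):
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def diff(p):
    return trim([i * c for i, c in enumerate(p)][1:] or [Fraction(0)])

def laguerre(n, a):
    """L_n(x, a) = P_n(n - 1, x) with N = -x + a and D = x, via recurrence (1)."""
    N = [Fraction(a), Fraction(-1)]
    D = [Fraction(0), Fraction(1)]
    dD = diff(D)                                          # D' = 1
    P = [Fraction(1)]
    for m in range(n):
        bracket = add(N, mul([Fraction(n - 1 - m)], dD))  # N + (k - m) D', k = n - 1
        P = add(mul(bracket, P), mul(D, diff(P)))
    return P

def has_full_degree(n, a):
    """True when L_n has degree n with leading coefficient (-1)^n."""
    L = laguerre(n, a)
    return len(L) == n + 1 and L[-1] == (-1) ** n

# Hypothetical sample values of a, including zero and a negative integer.
samples = [Fraction(-3), Fraction(0), Fraction(1, 2), Fraction(-5, 2)]
print(all(has_full_degree(n, a) for a in samples for n in range(1, 7)))
print(laguerre(2, 0), laguerre(3, -2))
```

The two printed polynomials illustrate cases $B_2$ and $B_4$: $L_2(x, 0) = x^2 - 2x$ has a simple zero at $x = 0$, and $L_3(x, -2) = -x^3$ has a zero of order three there.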

Points of Inflexion. By a procedure analogous to that used for Jacobi Polynomials 
we can show that the points of inflexion of $L_{n+1}(x, a)$ are located at the 
zeros of odd order of $L_{n-1}(x, a + 2)$. 

The Polynomials $P_n(0, x)$. If we set $k = 0$ in (1), (2), and (3) we obtain the 
following relationships for $P_n(0, x)^5 \equiv P_n(x) \equiv P_n$. 

\[ P_{n+1}(x) \equiv [N - nD']\,P_n(x) + D\,P_n'(x). \tag{7} \]

\[ P_{n+1}'(x) \equiv (n + 1)\left[N' - \frac{n}{2}D''\right]P_n(x). \tag{8} \]

\[ P_{n+1}(x) \equiv [N - nD']\,P_n(x) + n\left(N' - \frac{n - 1}{2}D''\right)D\,P_{n-1}(x). \tag{9} \]

Theorems $I_1$ to $I_{10}$ inclusive, with $k = 0$, hold for $P_n(x)$. In addition, the 
following theorems hold for $P_n$. 

Theorem $H_1$. Suppose $N$ linear and $D(x) > 0$ for all $x$. Furthermore, let 
$N' - \frac{m}{2}D'' < 0$, $m = 1, 2, 3, \dots$. Then $P_n$ has $n$ real, distinct zeros which 
separate the zeros of $P_{n+1}$. 

Proof: Denote the zeros of $P_n$ by $\alpha_{n,i}$, $i = 1, 2, \dots, n$, $\alpha_{n,i} < \alpha_{n,i+1}$. Suppose 
$N' > 0$. $N$ being linear has a single zero $\alpha_{11}$. Furthermore, since $P_1 = N$, 
then $P_1 < 0$ for $x < \alpha_{11}$ and $> 0$ for $x > \alpha_{11}$. We pass now to $P_2$. From (7), 
we see that $P_2(\alpha_{11}) > 0$, (since $D > 0$ and $P_1' > 0$). Also (8) shows $P_2'(x) > 0$ 
for $x < \alpha_{11}$ and $< 0$ for $x > \alpha_{11}$. This follows from what was noted concerning 
the sign of $P_1$ for $x > \alpha_{11}$ and $x < \alpha_{11}$, together with the hypothesis that 
$N' - \frac{1}{2}D'' < 0$. Thus, there exists a zero of $P_2$ in $(-\infty, \alpha_{11})$ and a zero in $(\alpha_{11}, \infty)$ 
and our theorem holds for $n = 1$. Assume that the theorem is true for $n = h$. 
The sequence $P_h'(\alpha_{h,i})$, $i = 1, 2, \dots, h$, is alternately positive and negative, 
and by (7), with $D > 0$, the same holds for $P_{h+1}(\alpha_{h,i})$. 
Since, from (8), the only extrema of $P_{h+1}$ are at $\alpha_{h,i}$, $i = 1, 2, \dots, h$, we conclude 
that there are $h - 1$ zeros of $P_{h+1}$ separating the $\alpha_{h,i}$, $i = 1, 2, \dots, h$. Since 
$P_h'(\alpha_{h,1}) > 0$ we conclude that $P_h < 0$ for $x < \alpha_{h,1}$. This fact, combined with 
(8), shows $P_{h+1}'(x) > 0$ for $x < \alpha_{h,1}$. $P_{h+1}(\alpha_{h,1})$ being positive, it follows that 
there exists a zero of $P_{h+1}$ in $(-\infty, \alpha_{h,1})$. Similar reasoning establishes the 
existence of a zero of $P_{h+1}$ in $(\alpha_{h,h}, \infty)$. Our theorem is thus established for 
$N' > 0$. The case $N' < 0$ can be similarly treated. 

5 E. H. Hildebrandt, loc. cit. pp. 369. 

Theorem H t : If D(x) > 0 for aU x, D" < 0, N' - J D" < 0 , N' = 0 , N ^ 0 , 

then P„ , n = 2 , 3 , • • • , has n — 1 real, distinct zeros which are separated by the 
zeros of P„_i. 

Proof: Since P, = N = const., we see from ( 7 ) that Pt is linear. The reason¬ 
ing of Theorem Hi applies where we now start with P t . 

Theorem $H_3$: If $D(x) > 0$ for all $x$, except $x = 0$, where $D$ has a double zero, and 
if $N' \ne 0$, $N' - \frac{n}{2}D'' < 0$, $n = 1, 2, 3, \dots$, then $P_n$ has $n$ real, distinct zeros 
which separate those of $P_{n+1}$. 

Proof: Theorem $I_1$ with $k = 0$ assures us that $P_n$ and $D$ have no zeros in 
common. The proof now follows the line of reasoning of Theorem $H_1$. 

Theorem $H_4$: If $D(x) > 0$ for all $x$ except $x = 0$, where $D$ has a double zero, and 
if $N' = 0$, $N \ne 0$, $N' - \frac{m}{2}D'' < 0$, $m = 1, 2, 3, \dots$, then $P_n$ has $n - 1$ real, 
distinct zeros which separate those of $P_{n+1}$, $n = 1, 2, 3, \dots$. This theorem follows 
from $H_3$ as did $H_2$ from $H_1$. 

Points of Inflexion. Setting $k = 0$ in (4) leads to, 

\[ P_{n+1}'' = (n + 1)(n)\left[N' - \frac{n}{2}D''\right]\left[N' - \frac{n - 1}{2}D''\right]P_{n-1}. \]

This shows, under the assumptions of Theorems $H_1$ to $H_4$ inclusive, that the 
points of inflexion of $P_{n+1}$ are at the zeros of $P_{n-1}$. 

Hermite Polynomials. Theorem $H_1$ and the statement immediately above concerning 
points of inflexion apply directly to Hermite Polynomials, where $N = -x$ 
and $D \equiv \sigma^2$. 
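
For the Hermite case the relations with $k = 0$ are easy to exercise. The sketch below is an illustration under stated assumptions and not part of the original paper. It takes $N = -x$ and $D \equiv \sigma^2$ in recurrence (7), so that $P_{n+1} = -x P_n + \sigma^2 P_n'$, and checks relation (8), which here reduces to $P_{n+1}' = -(n + 1) P_n$. The function names and the default $\sigma^2 = 1$ are hypothetical.

```python
from fractions import Fraction

# Polynomials are coefficient lists in ascending powers of x.

def trim(p):
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def diff(p):
    return trim([i * c for i, c in enumerate(p)][1:] or [Fraction(0)])

def hermite_like(n, sigma2=1):
    """P_n(0, x) from recurrence (7) with N = -x and D = sigma2 (a constant)."""
    N = [Fraction(0), Fraction(-1)]
    D = [Fraction(sigma2)]
    P = [Fraction(1)]
    for _ in range(n):
        # D' = 0 for constant D, so the bracket N - m D' in (7) is just N.
        P = add(mul(N, P), mul(D, diff(P)))
    return P

def relation_8_holds(n, sigma2=1):
    """Check P'_{n+1} = -(n + 1) P_n, which is (8) with N' = -1 and D'' = 0."""
    lhs = diff(hermite_like(n + 1, sigma2))
    rhs = mul([Fraction(-(n + 1))], hermite_like(n, sigma2))
    return lhs == rhs

print(hermite_like(2, 4), hermite_like(3))
print(all(relation_8_holds(n, s) for n in range(6) for s in (1, 4)))
```

With $\sigma^2 = 1$ the polynomials produced are, up to the sign $(-1)^n$, the familiar Hermite polynomials $x$, $x^2 - 1$, $x^3 - 3x$, and so on.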


Lehigh University. 



THE SIMULTANEOUS COMPUTATION OF GROUPS OF REGRESSION 
EQUATIONS AND ASSOCIATED MULTIPLE CORRELATION 

COEFFICIENTS 

By Paul S. Dwyer 

1. Introduction. The need sometimes arises for the prediction of a number of 
different variables from a given group of so-called fundamental variables. In 
the work of college prediction, for example, one might desire regression equations 
predicting certain measures of college achievement (e.g., first semester average, 
first semester English grade, first semester mathematics grade, number of hours 
of A received during first semester, etc.) on the basis of a number of other factors 
(e.g., high school record, score on American Council on Education Psychological 
Examination, score on some standard English achievement test, score on some 
standard mathematics achievement test, etc.). It is the purpose of this paper 
to show how the regression coefficients and the associated multiple correlation 
coefficients can be obtained simultaneously. The essence of the method is a 
simple device by which one solution of general normal equations may be made to 
serve for all cases. 

2. The normal equations. Let $x_1, x_2, x_3, \dots, x_n$ be the so-called fundamental
variables and let $x_k$ be the predicted variable. The normal equations
are computed by standard methods which result in one of three types.

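Type I. Normal equations for determining $b_0, b_1, b_2, b_3, \dots, b_n$ ($N$ denotes the number of observations).

$$
\begin{aligned}
b_0 N + b_1 \sum x_1 + b_2 \sum x_2 + b_3 \sum x_3 + \cdots + b_n \sum x_n - \sum x_k &= 0 \\
b_0 \sum x_1 + b_1 \sum x_1^2 + b_2 \sum x_1 x_2 + b_3 \sum x_1 x_3 + \cdots + b_n \sum x_1 x_n - \sum x_1 x_k &= 0 \\
b_0 \sum x_2 + b_1 \sum x_2 x_1 + b_2 \sum x_2^2 + b_3 \sum x_2 x_3 + \cdots + b_n \sum x_2 x_n - \sum x_2 x_k &= 0 \\
\vdots \qquad & \\
b_0 \sum x_n + b_1 \sum x_n x_1 + b_2 \sum x_n x_2 + b_3 \sum x_n x_3 + \cdots + b_n \sum x_n^2 - \sum x_n x_k &= 0
\end{aligned}
$$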

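Type II. Normal equations for determining $b_1, b_2, b_3, \dots, b_n$, where

$$
\bar{x}_i = x_i - M_{x_i}
$$

$$
\begin{aligned}
b_1 \sum \bar{x}_1^2 + b_2 \sum \bar{x}_1 \bar{x}_2 + b_3 \sum \bar{x}_1 \bar{x}_3 + \cdots + b_n \sum \bar{x}_1 \bar{x}_n - \sum \bar{x}_1 \bar{x}_k &= 0 \\
b_1 \sum \bar{x}_2 \bar{x}_1 + b_2 \sum \bar{x}_2^2 + b_3 \sum \bar{x}_2 \bar{x}_3 + \cdots + b_n \sum \bar{x}_2 \bar{x}_n - \sum \bar{x}_2 \bar{x}_k &= 0 \\
\vdots \qquad & \\
b_1 \sum \bar{x}_n \bar{x}_1 + b_2 \sum \bar{x}_n \bar{x}_2 + b_3 \sum \bar{x}_n \bar{x}_3 + \cdots + b_n \sum \bar{x}_n^2 - \sum \bar{x}_n \bar{x}_k &= 0
\end{aligned}
$$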
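Type III. Normal equations for determining $\beta_1, \beta_2, \beta_3, \dots, \beta_n$.

$$
\begin{aligned}
\beta_1 + r_{12}\beta_2 + r_{13}\beta_3 + \cdots + r_{1n}\beta_n - r_{1k} &= 0 \\
r_{21}\beta_1 + \beta_2 + r_{23}\beta_3 + \cdots + r_{2n}\beta_n - r_{2k} &= 0 \\
\vdots \qquad & \\
r_{n1}\beta_1 + r_{n2}\beta_2 + r_{n3}\beta_3 + \cdots + r_{nn}\beta_n - r_{nk} &= 0
\end{aligned}
$$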

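The three types are special cases of the general

$$
\begin{aligned}
d_{11}y_1 + d_{12}y_2 + d_{13}y_3 + \cdots + d_{1j}y_j + \cdots + d_{1n}y_n - d_{1k} &= 0 \\
d_{21}y_1 + d_{22}y_2 + d_{23}y_3 + \cdots + d_{2j}y_j + \cdots + d_{2n}y_n - d_{2k} &= 0 \\
d_{31}y_1 + d_{32}y_2 + d_{33}y_3 + \cdots + d_{3j}y_j + \cdots + d_{3n}y_n - d_{3k} &= 0 \\
\vdots \qquad & \\
d_{i1}y_1 + d_{i2}y_2 + d_{i3}y_3 + \cdots + d_{ij}y_j + \cdots + d_{in}y_n - d_{ik} &= 0 \\
\vdots \qquad & \\
d_{n1}y_1 + d_{n2}y_2 + d_{n3}y_3 + \cdots + d_{nj}y_j + \cdots + d_{nn}y_n - d_{nk} &= 0
\end{aligned}
$$

where the $y_j$ are the regression coefficients and $d_{ij} = d_{ji}$.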

The methods described in this paper are applicable to the general case and 
hence to each of the three particular types. 

In examining the normal equations, it is noticed that the first n terms of each 
equation are completely determined by the n fundamental variables. The 
equations, aside from the last terms, are identical no matter what variable is 
predicted. It is only necessary to devise a technique for separating the
contributions of the $d_{ik}$ terms.
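The observation above can be illustrated with a short Python sketch that assembles a Type III system from raw scores. It is only an illustration: the helper names `correlation` and `type_iii_system` and the score lists are hypothetical and do not come from the paper. The point it shows is that the coefficient matrix is built once from the fundamental variables, while each predicted variable contributes only its own column of $r_{ik}$ values.

```python
from math import sqrt

def correlation(u, v):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)

def type_iii_system(fundamental, predicted):
    """Return (R, rhs) for Type III normal equations.

    R[i][j] is r_ij among the fundamental variables (shared by all cases);
    rhs[name][i] is r_ik for the predicted variable called `name`.
    """
    R = [[correlation(xi, xj) for xj in fundamental] for xi in fundamental]
    rhs = {name: [correlation(xi, xk) for xi in fundamental]
           for name, xk in predicted.items()}
    return R, rhs

# Hypothetical scores, invented purely for illustration.
fundamental = [[1, 2, 3, 4, 5], [2, 1, 4, 3, 5]]
predicted = {"A": [1, 3, 2, 5, 4], "B": [5, 4, 3, 2, 1]}
R, rhs = type_iii_system(fundamental, predicted)
```

Only `rhs` changes when another predicted variable is added; `R` is computed once.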

3. Solution by determinants. One method utilizes determinants. The
value $y_j$ is expressed in terms of a determinant involving a column with entries
$d_{1k}, d_{2k}, d_{3k}, \dots, d_{nk}$. The determinant is expanded in terms of this column.

Specifically, let $D$ be the determinant of the coefficients of the $y_j$ and let $D_{ij}$
be the cofactor of any element $d_{ij}$ of $D$. Then

$$
D = \sum_{i=1}^{n} D_{ij} d_{ij}
$$

and

$$
\begin{aligned}
y_1 &= \frac{1}{D}\left(D_{11} d_{1k} + D_{21} d_{2k} + D_{31} d_{3k} + \cdots + D_{i1} d_{ik} + \cdots + D_{n1} d_{nk}\right) \\
y_2 &= \frac{1}{D}\left(D_{12} d_{1k} + D_{22} d_{2k} + D_{32} d_{3k} + \cdots + D_{i2} d_{ik} + \cdots + D_{n2} d_{nk}\right) \\
y_3 &= \frac{1}{D}\left(D_{13} d_{1k} + D_{23} d_{2k} + D_{33} d_{3k} + \cdots + D_{i3} d_{ik} + \cdots + D_{n3} d_{nk}\right) \\
&\;\;\vdots \\
y_n &= \frac{1}{D}\left(D_{1n} d_{1k} + D_{2n} d_{2k} + D_{3n} d_{3k} + \cdots + D_{in} d_{ik} + \cdots + D_{nn} d_{nk}\right)
\end{aligned}
$$

It is only necessary to compute $D_{ij}/D$ to find the coefficient of $d_{ik}$ in the expansion
of $y_j$.

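An illustration is given. The normal equations are

$$
\begin{aligned}
\beta_1 + .3300\,\beta_2 + .2100\,\beta_3 - r_{1k} &= 0 \\
.3300\,\beta_1 + \beta_2 - .4800\,\beta_3 - r_{2k} &= 0 \\
.2100\,\beta_1 - .4800\,\beta_2 + \beta_3 - r_{3k} &= 0
\end{aligned}
$$

from which at once

$$
\begin{aligned}
\beta_1 &= \frac{1}{D}\left(.7696\, r_{1k} - .4308\, r_{2k} - .3684\, r_{3k}\right) \\
\beta_2 &= \frac{1}{D}\left(-.4308\, r_{1k} + .9559\, r_{2k} + .5493\, r_{3k}\right) \\
\beta_3 &= \frac{1}{D}\left(-.3684\, r_{1k} + .5493\, r_{2k} + .8911\, r_{3k}\right)
\end{aligned}
$$

and also

$$
\begin{aligned}
D = .550072 &= (1.00)(.7696) + (.33)(-.4308) + (.21)(-.3684) \\
&= (.33)(-.4308) + (1.00)(.9559) + (-.48)(.5493) \\
&= (.21)(-.3684) + (-.48)(.5493) + (1.00)(.8911)
\end{aligned}
$$

so that

$$
\begin{aligned}
\beta_1 &= 1.3991\, r_{1k} - .7832\, r_{2k} - .6697\, r_{3k} \\
\beta_2 &= -.7832\, r_{1k} + 1.7378\, r_{2k} + .9986\, r_{3k} \\
\beta_3 &= -.6697\, r_{1k} + .9986\, r_{2k} + 1.6200\, r_{3k}
\end{aligned}
$$

It is only necessary to insert any given values $r_{1k}, r_{2k}, r_{3k}$ to obtain the
coefficients of any specific regression equation.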
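The cofactor expansion of this illustration can be reproduced with a few lines of Python. The sketch below is not part of the original paper; the function names `minor`, `det`, `cofactor` and `betas` are chosen here for illustration, and the recursive determinant is only sensible for small systems such as this one.

```python
def minor(m, i, j):
    """Matrix m with row i and column j removed."""
    return [[v for c, v in enumerate(row) if c != j]
            for r, row in enumerate(m) if r != i]

def det(m):
    """Determinant by expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

def cofactor(m, i, j):
    """Signed cofactor D_ij of element d_ij."""
    return (-1) ** (i + j) * det(minor(m, i, j))

# Coefficients of the three normal equations of the illustration.
d = [[1.00, 0.33, 0.21],
     [0.33, 1.00, -0.48],
     [0.21, -0.48, 1.00]]
D = det(d)

# coeff[j][i] is D_ij / D, the multiplier of r_ik in the expansion of beta_(j+1).
coeff = [[cofactor(d, i, j) / D for i in range(3)] for j in range(3)]

def betas(r):
    """Regression coefficients for one predicted variable with correlations r."""
    return [sum(c * x for c, x in zip(row, r)) for row in coeff]
```

Calling `betas` with any triple $(r_{1k}, r_{2k}, r_{3k})$ then plays the role of inserting the given values into the general expansion.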

4. Solutions without determinants. Theoretically the solution by determinants
is excellent, but as the number of variables increases the work of computing
the $n^2$ cofactors [or the $n(n+1)/2$ different cofactors] becomes enormous.

We desire a technique for separating the contributions of the last terms when
determinants are not used. This can be accomplished by using a separate
column for each $d_{ik}$. Before algebraic manipulation, the value $d_{ik}$ is factored
from the column and, after the manipulative solution is complete, the multiplication
by $d_{ik}$ is carried out.


As an example consider the normal equations

$$
\begin{aligned}
\beta_1 + r_{12}\beta_2 - r_{1k} &= 0 \\
r_{21}\beta_1 + \beta_2 - r_{2k} &= 0
\end{aligned}
$$

where $r_{12} = r_{21} = .3300$. Then the normal equations may be represented by
rows (1) and (2) of Table I.


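TABLE I

| Row | Operation | $\beta_1$ | $\beta_2$ | $r_{1k}$ | $r_{2k}$ |
|---|---|---|---|---|---|
| (1) | | 1.0000 | .3300 | -1.0000 | |
| (2) | | .3300 | 1.0000 | | -1.0000 |
| (3) | -.3300 times (2) | -.1089 | -.3300 | | .3300 |
| (4) | (1) + (3) | .8911 | | -1.0000 | .3300 |
| (5) | -(4) divided by .8911 | -1.0000 | | 1.1222 | -.3703 |
| (6) | -.3300 times (5) | .3300 | | -.3703 | .1222 |
| (7) | -(2) + (6) | | -1.0000 | -.3703 | 1.1222 |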


The four decimal place solution, whose steps are indicated by (3), (4), (5), (6), (7),
is from (5) and (7)

$$
\begin{aligned}
\beta_1 &= 1.1222\, r_{1k} - .3703\, r_{2k} \\
\beta_2 &= -.3703\, r_{1k} + 1.1222\, r_{2k}
\end{aligned}
$$

This device may be combined with most of the standard methods of solving 
normal equations. 
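The device of Table I can be sketched in Python as an elimination that carries one extra column per $r_{ik}$. This is a minimal sketch rather than the paper's exact sequence of rows: the function name `solve_with_separate_columns` is hypothetical, the elimination is done in Gauss-Jordan fashion, and no pivoting is attempted, so every pivot is assumed to be nonzero.

```python
def solve_with_separate_columns(a):
    """Solve sum_j a[i][j]*beta_j - r_ik = 0 for general r_ik.

    Returns c with beta_p = sum_i c[p][i] * r_ik, as read from the final rows.
    """
    n = len(a)
    # Row i holds the beta coefficients followed by one column per r_ik;
    # the entry -1 marks the term -r_ik of equation i.
    rows = [[float(v) for v in a[i]] + [-1.0 if c == i else 0.0 for c in range(n)]
            for i in range(n)]
    for p in range(n):
        pivot = rows[p][p]                       # assumed nonzero
        rows[p] = [-v / pivot for v in rows[p]]  # coefficient of beta_p becomes -1
        for i in range(n):
            if i != p:
                f = rows[i][p]
                rows[i] = [v + f * w for v, w in zip(rows[i], rows[p])]
    # Each row now reads -beta_p + sum_i c[p][i] * r_ik = 0.
    return [row[n:] for row in rows]

c = solve_with_separate_columns([[1.0, 0.33], [0.33, 1.0]])
```

For the two-variable example, `c` reproduces the coefficients of rows (5) and (7) to four places.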

5. Combination with Doolittle method. Especially to be recommended is a 
combination of this device with the Doolittle method which is recognized as a 
most efficient method of solving normal equations in from five to ten variables
[1] [2]. One of the advantages of the Doolittle method is that related multiple 
regression coefficients may be obtained from the same forward solution, though 
additional back solutions are necessary [3]. 

The problem which led to the development of this technique was the simultaneous
prediction of scores on various occupations covered by the Strong
Vocational Interest Blank from the scores on a few fundamental occupations. 
A multiple factor analysis revealed that five basic factors account for most of the 
scores. Five occupational scores, serving as approximations to the five basic 
factors, were used as the fundamental variables and the other scores were 
predicted from them. 

As an illustration of this prediction technique combined with the Doolittle 
method, I have selected three test scores as fundamental since the solution based 
on them shows all the steps of the Doolittle method and is shorter than the five 
variable problem. Actually, solution by determinants (section 3) is advised
for problems involving three variables. The steps of the Doolittle solution are
presented in Table II. The results should be compared with those of the
determinant solution of section 3.

The first column indicates the row and the second the description of the 
algebraic operation. The next three columns are the standard columns of a 
Doolittle presentation with the conventional elimination of the lower left entries. 
The next three columns carry through the Doolittle method with the values
$r_{1k}$, $r_{2k}$, $r_{3k}$ kept in separate columns. The last column is an adaptation of the
conventional summary check column of the Doolittle solution.


TABLE II

Generalized Doolittle Presentation

| Row | Operation | $\beta_1$ | $\beta_2$ | $\beta_3$ | $r_{1k}$ | $r_{2k}$ | $r_{3k}$ | S |
|---|---|---|---|---|---|---|---|---|
| (1) | | 1.0000 | .3300 | .2100 | -1.0000 | | | .5400 |
| (2) | | .3300 | 1.0000 | -.4800 | | -1.0000 | | -.1500 |
| (3) | | .2100 | -.4800 | 1.0000 | | | -1.0000 | -.2700 |
| (4) | Repeat (1) | 1.0000 | .3300 | .2100 | -1.0000 | | | .5400 |
| (5) | Negative of (4) | -1.0000 | -.3300 | -.2100 | 1.0000 | | | -.5400 |
| (6) | Repeat (2) | | 1.0000 | -.4800 | | -1.0000 | | -.1500 |
| (7) | -.3300 times (4) | | -.1089 | -.0693 | .3300 | | | -.1782 |
| (8) | (6) + (7) | | .8911 | -.5493 | .3300 | -1.0000 | | -.3282 |
| (9) | -(8) divided by .8911 | | -1.0000 | .6164 | -.3703 | 1.1222 | | .3683 |
| (10) | Repeat (3) | | | 1.0000 | | | -1.0000 | -.2700 |
| (11) | -.2100 times (4) | | | -.0441 | .2100 | | | -.1134 |
| (12) | .6164 times (8) | | | -.3386 | .2034 | -.6164 | | -.2023 |
| (13) | (10) + (11) + (12) | | | .6173 | .4134 | -.6164 | -1.0000 | -.5857 |
| (14) | -(13) divided by .6173 | | | -1.0000 | -.6697 | .9985 | 1.6200 | .9488 |
| (15) | .6164 times (14) | | | -.6164 | -.4128 | .6155 | *.9986* | .5848 |
| (16) | (9) + (15) | | -1.0000 | | -.7831 | 1.7377 | *.9986* | .9531 |
| (17) | -.2100 times (14) | | | .2100 | .1406 | *-.2097* | *-.3402* | -.1992 |
| (18) | -.3300 times (16) | | .3300 | | .2584 | *-.5734* | *-.3295* | -.3145 |
| (19) | (5) + (17) + (18) | -1.0000 | | | 1.3990 | *-.7831* | *-.6697* | -1.0537 |


The general solution is read from rows (19), (16), (14) and is

$$
\begin{aligned}
\beta_1 &= 1.3990\, r_{1k} - .7831\, r_{2k} - .6697\, r_{3k} \\
\beta_2 &= -.7831\, r_{1k} + 1.7377\, r_{2k} + .9986\, r_{3k} \\
\beta_3 &= -.6697\, r_{1k} + .9986\, r_{2k} + 1.6200\, r_{3k}
\end{aligned}
$$

which agrees, aside from the last place, with the result of the solution by
determinants.
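The forward and back solutions with separate $r_{ik}$ columns and a summary check column can be sketched as follows. This is an illustrative reading of the generalized Doolittle layout, not a transcription of it: the function name `generalized_doolittle` and the tolerance `tol` are assumptions made here, intermediate values are not rounded to four places, and pivots are assumed to be nonzero.

```python
def generalized_doolittle(a, tol=1e-9):
    """Forward and back solution carrying one column per r_ik and a check column S.

    Returns c with beta_p = sum_i c[p][i] * r_ik.
    """
    n = len(a)
    rows = []
    for i in range(n):
        body = [float(v) for v in a[i]] + [-1.0 if c == i else 0.0 for c in range(n)]
        rows.append(body + [sum(body)])          # S is the sum of the other entries

    def checked(row):
        # The summary column must agree with the sum of the rest of the row.
        assert abs(sum(row[:-1]) - row[-1]) < tol, "check column disagrees"
        return row

    # Forward solution: eliminate the lower left entries, then scale so that
    # the leading coefficient of each reduced row is -1 (rows (5), (9), (14)).
    pivots = []
    for p in range(n):
        row = rows[p][:]
        for q in range(p):
            f = row[q]
            row = [v + f * w for v, w in zip(row, pivots[q])]
        pivots.append(checked([-v / row[p] for v in row]))

    # Back solution: start from the last reduced row and substitute upward.
    solved = [None] * n
    for p in reversed(range(n)):
        row = pivots[p][:]
        for q in range(p + 1, n):
            f = row[q]
            row = [v + f * w for v, w in zip(row, solved[q])]
        solved[p] = checked(row)
    return [row[n:2 * n] for row in solved]

c = generalized_doolittle([[1.00, 0.33, 0.21],
                           [0.33, 1.00, -0.48],
                           [0.21, -0.48, 1.00]])
```

Because nothing is rounded along the way, `c` agrees with the determinant solution and differs from the four-place tabular values only in the last place.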

It is wise to check in the original equations (1), (2), (3) as soon as any $\beta_i$ is
found. Row (14), for example, should be checked by showing

$$
\begin{aligned}
(-.6697)(1.00) + (.9985)(.33) + (1.6200)(.21) &= .0000 \\
(-.6697)(.33) + (.9985)(1.00) + (1.6200)(-.48) &= -.0001 \\
(-.6697)(.21) + (.9985)(-.48) + (1.6200)(1.00) &= 1.0001
\end{aligned}
$$

The same should be done with row (16) as soon as it is computed. Row (19)
should be treated similarly.
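The substitution check just described is easy to mechanize. In the following sketch the name `residuals` is hypothetical; it substitutes the multipliers found for one $\beta$ into each original equation and reports how far the result is from the expected 0 or 1.

```python
# Coefficients of the original equations (1), (2), (3).
A = [[1.00, 0.33, 0.21],
     [0.33, 1.00, -0.48],
     [0.21, -0.48, 1.00]]

def residuals(coeffs, j, a=A):
    """Deviation from the expected value in each original equation.

    coeffs: multipliers of r_1k..r_nk found for beta_(j+1) (j is 0-based).
    Equation j should give 1 and every other equation 0.
    """
    n = len(a)
    out = []
    for eq in range(n):
        total = sum(coeffs[i] * a[i][eq] for i in range(n))
        out.append(total - (1.0 if eq == j else 0.0))
    return out

row14 = [-0.6697, 0.9985, 1.6200]   # coefficients read from row (14)
res = residuals(row14, 2)
```

Each entry of `res` should be of the order of the rounding error of a four-place computation.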

6. Many regression equations. If large numbers of regression equations are
to be generated (the Strong Vocational Interest Study had 29 dependent
variables), the following technique is suggested. Make a table with columns
$r_{1k}$, $r_{2k}$, etc. and use the rows to indicate the different values of $k$. On another
slip of paper insert the general values $\beta_1, \beta_2, \beta_3, \dots, \beta_n$ in successive rows so
that a folding of the paper will bring any general $\beta$ expansion in conjunction
with the $r$'s of any test, $k$. The scheme is illustrated in Table III.


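TABLE III

| No. | Occupation | $r_{1k}$ | $r_{2k}$ | $r_{3k}$ | $\beta_{1k}$ | $\beta_{2k}$ | $\beta_{3k}$ | $r$ |
|---|---|---|---|---|---|---|---|---|
| 1 | Teacher | 1.00 | .33 | .21 | 1.00 | .00 | .00 | 1.00 |
| 2 | Physicist | .33 | 1.00 | -.48 | .00 | 1.00 | .00 | 1.00 |
| 3 | Office Worker | .21 | -.48 | 1.00 | .00 | .00 | 1.00 | 1.00 |
| 4 | Doctor | .17 | .79 | -.52 | -.03 | .72 | -.17 | .81 |
| 5 | Lawyer | -.02 | .16 | -.59 | .24 | -.30 | -.78 | .64 |
| 6 | Engineer | .16 | .78 | -.02 | -.37 | 1.21 | .64 | .93 |
| | $\beta_1$ | 1.3990 | -.7831 | -.6697 | 1.0000 | | | |
| | $\beta_2$ | -.7831 | 1.7377 | .9986 | | 1.0000 | | |
| | $\beta_3$ | -.6697 | .9986 | 1.6200 | | | 1.0000 | |
| 10 | Mathematician | .46 | .96 | -.49 | .19 | .82 | -.14 | .97 |
| etc. | | | | | | | | |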

Thus, for the occupation of Engineer,

$$
\begin{aligned}
\beta_1 &= 1.3990\,(.16) + (-.7831)(.78) + (-.6697)(-.02) = -.37 \\
\beta_2 &= -.7831\,(.16) + (1.7377)(.78) + (.9986)(-.02) = 1.21 \\
\beta_3 &= -.6697\,(.16) + (.9985)(.78) + (1.6200)(-.02) = .64
\end{aligned}
$$





The value of the multiple correlation coefficient is then computed from the
formula

$$
r_{k \cdot 123 \cdots n} = \sqrt{\beta_{1k} r_{1k} + \beta_{2k} r_{2k} + \cdots + \beta_{nk} r_{nk}}
$$

In the illustration above

$$
r_{k \cdot 123} = \sqrt{(-.37)(.16) + (1.21)(.78) + (.64)(-.02)} = .93
$$
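The paper-folding scheme amounts to applying one fixed table of coefficients to each row of correlations. A minimal Python sketch, assuming the four-place coefficients read from Table II and using a hypothetical function name `betas_and_r`, is given below. Unlike the hand computation, it does not round the $\beta$'s to two places before forming the multiple correlation, so results may differ slightly in the last place.

```python
from math import sqrt

# General solution read from rows (19), (16), (14) of Table II.
COEFF = [[1.3990, -0.7831, -0.6697],
         [-0.7831, 1.7377, 0.9986],
         [-0.6697, 0.9986, 1.6200]]

def betas_and_r(r):
    """Return (betas, multiple_r) for one predicted variable.

    r holds its correlations r_1k, r_2k, r_3k with the fundamental variables.
    """
    b = [sum(c * x for c, x in zip(row, r)) for row in COEFF]
    multiple_r = sqrt(sum(bi * ri for bi, ri in zip(b, r)))
    return b, multiple_r

# Correlations of the occupation of Engineer, from Table III.
b_engineer, r_engineer = betas_and_r([0.16, 0.78, -0.02])
```

Repeating the call for each row of Table III generates all of the regression equations from the single general solution.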

7. Regression equations by deletion. The method of getting related regression
coefficients and correlation coefficients, described by Kurtz [3], is also
applicable. Again, a problem involving more than three variables is needed to 
show the real value of the scheme but the technique may be illustrated in the 
three variable case. We wish to find, from the forward solution of Table II, 
the regression equation and the multiple correlation coefficient when the first two 
fundamental variables only are used. We delete all columns involving test 3 
and complete the back solution as indicated in Table IV, which may be viewed 
as a substitute for the last ten rows of Table II. 


TABLE IV
(See Table II)

| Row | Operation | $\beta_1$ | $\beta_2$ | $r_{1k}$ | $r_{2k}$ |
|---|---|---|---|---|---|
| (20) | Repeat (9) | | -1.0000 | -.3703 | 1.1222 |
| (21) | -.3300 times (20) | | .3300 | .1222 | -.3703 |
| (22) | (5) + (21) | -1.0000 | | 1.1222 | -.3703 |


The results are

$$
\begin{aligned}
\beta_1 &= 1.1222\, r_{1k} - .3703\, r_{2k} \\
\beta_2 &= -.3703\, r_{1k} + 1.1222\, r_{2k}
\end{aligned}
$$

and these agree with the results of section 4.
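The three rows of Table IV can be retraced directly. The sketch below takes rows (5) and (9) of the forward solution with the columns involving test 3 deleted and completes the shortened back solution; the variable names are chosen here for illustration only.

```python
# Rows (5) and (9) of Table II with the beta_3 and r_3k columns deleted.
# Column order: beta_1, beta_2, r_1k, r_2k.
row5 = [-1.0000, -0.3300, 1.0000, 0.0]
row9 = [0.0, -1.0000, -0.3703, 1.1222]

row20 = row9[:]                               # (20) Repeat (9)
row21 = [-0.3300 * v for v in row20]          # (21) -.3300 times (20)
row22 = [u + v for u, v in zip(row5, row21)]  # (22) (5) + (21)

# beta_1 is read from row (22) and beta_2 from row (20).
beta1_coeffs = row22[2:]
beta2_coeffs = row20[2:]
```

No part of the forward solution is recomputed; only the back solution is shortened.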

8. The simplified back solution. In every case in which the $\beta$'s have been
given in terms of $r$'s the matrix of the coefficients is symmetric (sections 3, 4, 5, 7).
One wonders if this symmetry is generally true and if it holds for normal
equations of Type I or Type II.

Determinants are much more useful in establishing general properties, such
as the one under discussion, than they are in computing the values of regression
coefficients in the case of a problem involving many variables. We return to the
determinant notation of section 3.

In each of the three types, and hence in the general case, $d_{ij} = d_{ji}$ so that $D$ is a
symmetric determinant, $D_{ij} = D_{ji}$ and $D_{ij}/D = D_{ji}/D$. Hence the matrix of the
coefficients of the solution is symmetric.
This result may be used (1) to check the expanded results or (2) to eliminate
some of the work of the back solution. The $n$ coefficients must be recorded for
$\beta_n$ after which the column indicated by $r_{nk}$ may be dropped. The first $n - 1$
coefficients must be computed for $\beta_{n-1}$ after which the column indicated by
$r_{n-1,k}$ may be dropped, etc. The italicized entries in Table II are the ones
which are eliminated in this way. The remaining coefficients are sufficient to
completely determine the symmetric matrix.

The summary right hand check column cannot be readily used in the simplified
back solution but it is hardly to be recommended anyway. Kurtz [3]
argues against it on the ground that it is not necessary. The essential check is 
to see that each 0 solution satisfies all of the original equations. 
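The saving offered by symmetry, together with the essential check, can be sketched in Python. The function names `fill_symmetric` and `max_residual` are hypothetical, and the partial coefficients are the four-place values that survive in Table II when the dropped columns are omitted.

```python
def fill_symmetric(partial):
    """Complete the coefficient matrix from its lower triangle.

    partial[j] holds the multipliers of r_1k..r_(j+1)k kept for beta_(j+1).
    """
    n = len(partial)
    c = [[0.0] * n for _ in range(n)]
    for j, row in enumerate(partial):
        for i, v in enumerate(row):
            c[j][i] = c[i][j] = v
    return c

def max_residual(c, a):
    """Largest deviation of c times a from the identity matrix."""
    n = len(a)
    worst = 0.0
    for j in range(n):
        for eq in range(n):
            total = sum(c[j][i] * a[i][eq] for i in range(n))
            worst = max(worst, abs(total - (1.0 if eq == j else 0.0)))
    return worst

A = [[1.00, 0.33, 0.21],
     [0.33, 1.00, -0.48],
     [0.21, -0.48, 1.00]]

# One coefficient kept from row (19), two from row (16), three from row (14).
partial = [[1.3990], [-0.7831, 1.7377], [-0.6697, 0.9985, 1.6200]]
C = fill_symmetric(partial)
worst = max_residual(C, A)
```

A small value of `worst` indicates that every $\beta$ solution satisfies all of the original equations to within rounding.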

9. Conclusion. This paper provides a technique for the computation of 
general regression equations and shows how the technique may be combined 
with the Doolittle method in providing a practical means of mass prediction. 

University of Michigan.
CONSTITUTION

Article I 

NAME AND PURPOSE 

1. This organization shall be known as the Institute of Mathematical
Statistics.

2. Its object shall be to promote the interests of mathematical statistics.

Article II 
MEMBERSHIP 

1. The membership of the Institute shall consist of Members, Fellows, 
Honorary Members, and Sustaining Members. 

2. Fellows shall be the only voting members of the Institute. 

Article III 

OFFICERS, BOARD OF DIRECTORS, COMMITTEE ON MEMBERSHIP, 
AND COMMITTEE ON PUBLICATIONS 

1. The Officers of the Institute shall be a President, two Vice-Presidents, 
and a Secretary-Treasurer, elected for a term of one year by a majority ballot 
at the annual meeting of the Institute. Voting may be in person or by mail. 

(a) Exception. The first group of Officers shall be elected by a majority 
vote of the individuals present at the organization meeting, and shall serve until 
December 31, 1936.

2. The Board of Directors of the Institute shall consist of the Officers and 
the previous President. 

3. The Institute shall have a Committee on Membership composed of three 
Fellows. At their first meeting subsequent to the adoption of this Constitution, 
the Board of Directors shall elect three members as Fellows to serve as the 
Committee on Membership, one member of the Committee for a term of one 
year, another for a term of two years, and another for a term of three years. 
Thereafter the Board of Directors shall elect from among the Fellows one 
member annually at their first meeting after their election for a term of three 
years. The President shall designate one of the Vice-Presidents as Chairman
of this Committee. 

4. The Institute shall have a Committee on Publications composed of three 
Members or Fellows elected by the Board of Directors. The President shall 
designate a Vice-President as Ex Officio Chairman of this Committee. 
Article IV
MEETINGS 

1. A meeting for the presentation and discussion of papers, for the election of 
Officers, and for the transaction of other business of the Institute shall be held 
annually at such time as the Board of Directors may designate. Additional 
meetings may be called from time to time by the Board of Directors and shall be 
called at any time by the President upon written request from ten Fellows. 
Notice of the time and place of meeting shall be given to the membership by the 
Secretary-Treasurer at least thirty days prior to the date set for the meeting. 
All meetings except executive sessions shall be open to the public. Only 
papers accepted by a Program Committee appointed by the President may be 
presented to the Institute. 

2. The Board of Directors shall hold a meeting immediately after their 
election and again immediately before the expiration of their term. Other 
meetings of the Board may be held from time to time at the call of the President 
or any two members of the Board. Notice of each meeting of the Board, other 
than the two regular meetings, together with a statement of the business to be 
brought before the meeting, must be given to the members of the Board by the 
Secretary-Treasurer at least five days prior to the date set therefor. Should 
other business be passed upon, any member of the Board shall have the right to 
reopen the question at the next meeting. 

3. The Committee on Membership shall hold a meeting immediately after the 
annual meeting of the Institute. Further meetings of the Committee may be 
held from time to time at the call of the Chairman or any member of the
Committee provided notice of such call and the purpose of the meeting is given to
the members of the Committee by the Secretary-Treasurer at least five days 
before the date set therefor. Should other business be passed upon, any 
member of the Committee shall have the right to reopen the question at the 
next meeting. 

4. At a regularly convened meeting of the Board of Directors, three members 
shall constitute a quorum. At a regularly convened meeting of the Committee 
on Membership, two members shall constitute a quorum. 

Article V 
PUBLICATIONS 

1. In the beginning, the “Annals of Mathematical Statistics” shall serve as 
the official journal for the Institute. Other publications may be originated 
by the Board of Directors as occasion arises. 

Article VI 

EXPULSION OR SUSPENSION 

1. Except for non-payment of dues, no one shall be expelled or suspended 
except by action of the Board of Directors with not more than one negative vote. 
Article VII
AMENDMENTS 

1. This constitution may be amended by an affirmative two-thirds vote at 
any regularly convened meeting of the Institute provided notice of such proposed 
amendment shall have been sent to each Fellow by the Secretary-Treasurer at 
least thirty days before the date of the meeting at which the proposal is to be 
acted upon. Voting may be in person or by mail. 

BY-LAWS 

Article I 

DUTIES OF THE OFFICERS, BOARD OF DIRECTORS, COMMITTEE 
ON MEMBERSHIP, AND COMMITTEE ON PUBLICATIONS 

1. The President, or in his absence, one of the Vice-Presidents, or in the 
absence of the President and both Vice-Presidents, a Fellow selected by vote 
of the Fellows present, shall preside at the meetings of the Institute and of the 
Board of Directors. At meetings of the Institute, the presiding officer shall 
vote only in the case of a tie, but at meetings of the Board of Directors he may 
vote in all cases. At least three months before the date of the annual meeting, 
the President shall appoint a Nominating Committee of three members. It 
shall be the duty of the Nominating Committee to make nominations for 
Officers to be elected at the annual meeting and the Secretary-Treasurer shall 
notify all Fellows at least thirty days before the annual meeting. Additional 
nominations may be submitted in writing, if signed by at least ten Fellows of 
the Institute, up to the time of the meeting. 

2. The Secretary-Treasurer shall keep a full and accurate record of the 
proceedings at the meetings of the Institute and of the Board of Directors, 
send out calls for said meetings and, with the approval of the President and the 
Board, carry on the correspondence of the Institute. Subject to the direction 
of the Board, he shall have charge of the archives and other tangible and 
intangible property of the Institute. He shall send out calls for annual dues and 
acknowledge receipt of same; pay all bills approved by the President for
expenditures authorized by the Board or the Institute; keep a detailed account of all
receipts and expenditures, prepare a financial statement at the end of each year 
and present an abstract of the same at the annual meeting of the Institute after 
it has been audited by a Member or Fellow of the Institute appointed by the 
President as Auditor. The Auditor shall report to the President. 

3. The Board of Directors shall have charge of the funds and of the affairs 
of the Institute, with the exception of those affairs specifically assigned to the 
President or to the Committee on Membership. The Board shall have
authority to fill all vacancies ad interim, occurring among the Officers, Board of
Directors, or in any of the Committees. The Board may appoint such other
committees as may be required from time to time to carry on the affairs of the
Institute.

4. The Committee on Membership shall prepare and make available through 
the Secretary-Treasurer an announcement indicating the qualifications requisite 
for the different grades of membership. 

5. The Committee on Publications, under the general supervision of the 
Board of Directors, shall have charge of all matters connected with the
publications of the Institute, and of all books, pamphlets, manuscripts and other
literary or scientific material collected by the Institute. Once a year this 
Committee shall cause to be printed in the Official Journal the Constitution 
and By-Laws and a classified list of all the Members and Fellows of the Institute. 

Article II 
DUES 

1. Members shall pay five dollars at the time of admission to membership 
and shall receive the full current volume of the Official Journal. Thereafter, 
Members shall pay five dollars annual dues. The annual dues of Fellows shall 
be five dollars. The annual dues of Sustaining Members shall be fifty dollars. 
Honorary Members shall be exempt from all dues. 

2. Annual dues shall be payable on the first day of January of each year. 

3. The annual dues of a Fellow or Member include a subscription to the 
Official Journal. The annual dues of a Sustaining Member include two
subscriptions to the Official Journal.

4. It shall be the duty of the Secretary-Treasurer to notify by mail anyone 
whose dues may be six months in arrears, and to accompany such notice by a 
copy of this Article. If such person fail to pay such dues within three months 
from the date of mailing such notice, the Secretary-Treasurer shall report the 
delinquent one to the Board of Directors, by whom the person’s name may be 
stricken from the rolls and all privileges of membership withdrawn. Such 
person may, however, be re-instated by the Board of Directors upon payment 
of the arrears of dues. 


Article III 
SALARIES 

1. The Institute shall not pay a salary to any Officer, Director, or member of 
any committee. 


Article IV 
AMENDMENTS 

1. These By-Laws may be amended in the same manner as the Constitution 
or by a majority vote at any regularly convened meeting of the Institute, if the 
proposed amendment has been previously approved by the Board of Directors. 